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ABSTRACT 


The  problem  of  restoring  images  degraded  by  blur  and 
corrupted  by  noise  is  considered  in  this  dissertation. 

The  Fredholm  integral  equation  of  the  first  kind  in  a two- 
dimensional  form  adequately  describes  the  linear  model.  A dis- 
cretization is  performed  by  using  quadrature  methods.  By  trans- 
forming the  two-dimensional  array  into  vector  format  a regression 
model  results.  The  overdetermined  and  underdetermined  cases  are 
considered  in  detail,  with  the  derivation  of  the  estimators,  their 
covariance  matrices,  confidence  intervals  and  hypothesis  testing 
involving  parametric  functions  of  pixel  values.  The  problem  of  ill 
conditioning  is  examined  for  atmospheric  turbulence  and  diffraction 
limited  spread  functions.  The  results  of  the  restoration  of  simu- 
lated pictures  under  separable  spread  functions  are  presented. 

In  order  to  solve  the  ill  conditioning  of  the  restoration 
problem,  a priori  information  in  the  form  of  deterministic  con- 
straints is  proposed.  A comparison  with  existing  methods  like 
Wiener  filter,  smoothing  and  regularizing  techniques  is  made. 
Linear  equality  constraints  reduce  the  variance  of  the  estimators, 
but  some  bias  may  be  introduced  if  the  constraints  are  not  valid. 


iii 


A combination  of  estimation  and  hypothesis  testing  is  proposed  to 
decide  if  a reduction  of  the  mean  square  error  (taking  into  account 
both  bias  and  variance)  occurs.  Experimental  results  show  that 
more  acceptable  restored  pictures  are  obtained  in  the  restoration. 

Linear  inequality  constraints  are  incorporated  by  means  of 
a quadratic  programming  formulation.  The  natural  constraint  of 
nonnegativeness  of  pixel  values  is  handled  in  a formal  vay,  as  well 
as  other  types  of  restrictions  that  can  be  described  by  linear  in- 
equalities. Experimental  results  indicate  a substantial  improvement 
in  the  restoration  even  for  the  ill  conditioned  situation. 
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1.  INTRODUCTION 


The  subject  of  image  restoration,  encompassing  attempts  to 
remove  different  types  of  degradations  in  imaging  systems,  dates 
back  to  the  fifties  [l-l].  However,  it  was  the  space  program  of  the 
sixties,  with  its  need  for  high  quality  imagery,  that  provided  the 
necessary  motivation  for  the  development  of  the  field.  In  particular, 
the  work  developed  at  the  Jet  Propulsion  Laboratory  [1-2]  demon- 
strated the  feasibility  of  using  the  digital  computer  to  deal  with  the 
large  quantities  of  pictorial  data  involved.  The  success  of  the  effort 
opened  the  path  for  new  applications  that  now  range  the  large  spec- 
trum  of  biological  [l-3]  and  geological  sciences  [l-4],  high  energy 
physics  [1-5],  etc. 

Image  restoration  or  spatial  filtering  can  be  divided  into  two 
main  classes:  optical  and  digital  processing.  The  former  has  the 
advantages  of  larger  storage  capacity  and  faster  processing,  but 
does  not  achieve  the  precision  and  flexibility  of  the  latter.  This 
dissertation  will  be  concerned  with  digital  methods  for  image 

restoration,  with  emphasis  on  a firm  theoretical  basis  in  their  deri- 
vation . 

The  degradations  that  an  imaging  system  imposes  over  a 
picture  can  often  be  roughly  described  as  composed  by  a smoothing 


1 


2 


I 


I! 

[ 


V 

• J 


operation  due  to  the  finite  resolution  of  the  sensor  and  the  addition 
of  disturbances,  known  only  in  a statistical  sense.  The  earlier 
methods  of  restoration,  mostly  optically  oriented,  attempted  to  undo 
the  first  degradation  by  inverse  filtering  [1-6].  These  techniques 
used  the  Fourier  transforming  properties  of  lenses,  by  simply 
multiplying  the  Fourier  transform  of  the  object  by  the  inverse  of  the 
Fourier  transform  of  the  blurring  function.  The  presence  of  sta- 
tistical noise  corrupting  the  image  was  disregarded  and  this  fact 
often  limited  the  effectiveness  of  these  methods.  A nonoptimal  pro- 
cedure [1-7]  consisted  of  replacing  the  inverse  Fourier  transform  of 
the  blur  function  by  zero  in  the  spatial  frequencies  where  the  noise  is 
larger  than  the  signal. 

Perhaps  the  first  attempt  to  consider  a formal  way  to  deal 
with  the  presence  of  noise  in  an  image  is  due  to  Helstrom  [1-8]. 

The  image  and  noise  were  regarded  as  uncorrelated  random  pro- 
cesses with  a known  blur  function.  Slepian  [1-9]  considered  the 
lack  of  knowledge  of  the  blur  function,  and  also  modeled  it  as  a 
random  process.  Experiments  [1-10  and  1-11]  indicated  that  formal 

approaches  using  the  mean  square  error  criterion  gave  better  results 
than  ad  hoc  schemes. 

Digital  methods  for  image  restoration  have  had  to  face  the 
problems  of  storage  and  computational  time  in  dealing  with  large 
scale  sampled  images.  Some  of  the  methods  developed  have  utilized 
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simple,  ad  hoc  operations  while  others  [ 1-12,  1-13  and  1-143  have 
attempted  more  formal  approaches  based  on  mean  square  error.  In 
these  cases  the  Bayesian  approach  has  predominated,  with  the 
modeling  of  the  object  as  a two-dimensional  random  process. 

In  this  dissertation  a different  direction  is  taken.  In  many 
situations  the  experimenter  faces  the  restoration  task  with  very  little 
or  even  no  a priori  knowledge  about  the  object  to  be  restored.  In 
such  cases  the  use  of  the  Bayesian  approach  does  not  seem  to  offer 
the  best  alternative.  When  no  a priori  knowledge  about  the  image  is 
assumed,  a regression  model  adequately  describes  the  blurring  and 
addition  of  noise  processes.  The  original  object  is  simply  considered 
as  a set  of  parameters  to  be  determined,  given  the  knowledge  of  the 
blurred  and  noisy  image,  the  blurring  function  and  the  statistics  of 
the  noise.  The  necessity  of  digital  processing  requires  a discrete 
modeling  of  both  the  object  and  the  image. 

The  use  of  the  least  squares  criterion  leads  to  a very  tract- 
able and  general  mathematical  structure,  allowing  the  image  restora- 
tion process  to  be  cast  in  a technique  analogous  to  those  used  in  the 
field  of  econometrics,  for  example.  However,  the  lack  of  use  of  any 
a priori  knowledge  limits  the  effectiveness  of  the  restoration  process. 
It  will  be  shown  that  for  certain  amounts  of  blur  the  estimators  have 
very  large  variance,  masking  completely  the  real  content  of  the 
image . 


4 


The  model  used  is  flexible  enough  to  accommodate  some  a 
priori  information,  including  the  Bayesian  approach.  Since  this 
path  has  been  considerably  explored  in  the  past,  a new  approach  was 
pursued,  namely,  the  use  of  deterministic  constraints. 

Linear  equality  constraints  allow  a reduction  in  variance,  as 
a result  of  a reduction  in  the  dimensionality  of  the  problem.  The 
detection  of  any  bias  imposed  as  the  result  of  incorrectly  formulated 
constraints  is  also  discussed. 

The  problem  of  taking  into  account  some  physical  inequality 
constraints  that  should  be  satisfied  by  estimators  has  been  the  ob- 
ject of  discussion  by  several  authors.  The  most  obvious  restriction 
to  be  satisfied  in  image  restoration  is  nonnegativeness.  It  comes 
from  the  basic  physical  laws  governing  the  process  of  image  for- 
mation. Some  results  [1-15]  concerning  the  properties  of  Fourier 
transforms  of  nonnegative  functions  were  used  by  Lukosz  Tl-161 
to  give  bounds  on  the  transfer  function  of  a physical  system. 
Similarly,  Cleveland  and  Schell  [1-17]  extrapolated  the  spectrum  so 
that  it  would  become  an  autocorrelation  function,  imposing  that  its 
Fourier  transform  pair  be  nonnegative.  Phillip  [ 1- 18] considered 
the  problem  of  finding  the  maximum  likelihood  estimator  of  a con- 
tinuous function  assumed  to  be  nonnegative  and  upper  bounded, 
under  gaussian  noise.  A quadratic  expression  has  to  be  minimized 
under  these  constraints.  Necessary  and  sufficient  conditions  for 
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uniqueness  of  the  solution  were  derived  and  the  problem  was  ex- 
plicitly solved  in  some  special  cases.  Some  estimation  procedures 
can  Rive  only  nonnegative  results  as  a result  ci  an  exponentiation, 
for  example.  This  is  the  case  with  the  technique  of  homomorphic 
filtering  [1-19],  that  assumes  the  image  to  be  the  result  of  the  pro- 
duct of  an  illumination  and  a reflected  component.  The  assumption 
that  the  image  is  described  by  an  array  of  cells  whose  content  is 
given  by  the  Maxwell  Boltzmann  distribution  also  leads  to  estimators 
given  by  exponentials.  This  has  been  explored  by  Frieden  [1-20] 
and  Her shel  [1-21  and  1-22]  . Ad  hoc  procedures  have  also  been 
tried,  as  the  control  of  the  relaxation  factor  in  an  iterative  method 
to  solve  a linear  system  of  equations  [1-23].  Further  details  on 
these  proposed  methods  are  given  in  reference  [1-24]. 

This  dissertation  will  develop  the  inequality  constrained 
least  squares  approach  to  image  restoration.  The  proposed  method 
follows  a philosophy  similar  to  the  one  described  by  Phillip  [1-18] 
ior  the  case  of  the  discrete  model.  The  optimal  solution  is  given  by 
a quadratic  programming  procedure.  Any  kind  of  linear  inequality 
constraint  can  easily  be  incorporated  and,  as  a result,  requirements 
like  monotonicity  and  convexity  of  the  solutions  can  be  satisfied.  The 
statistical  analysis  of  the  estimators  is  considerably  more  complex 
than  the  previous  cases,  but  some  approximate  confidence  intervals 
for  functional  values  of  the  original  image  can  be  obtained.  Besides 


improving  the  quality  of  the  restoration  by  the  use  of  additional  a 
priori  information  in  the  statistical  procedure,  the  use  of  linear  in- 
equality constraints  in  the  form  of  lower  (nonnegativeness)  and  upper 
bounds  facilitates  the  display  of  the  pictorial  information. 

A word  about  notation  is  necessary.  An  attempt  has  been 
made  to  maintain  coherence  by  expressing  matrices  by  underlined 
capital  letters,  vectors  by  underlined  small  letters  and  scalars  by 
small  or  capital  nonunderlined  letters. 


2.  THE  RESTORATION  PROBLEM 
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This  chapter  presents  the  mathematical  framework  in  which 
the  restoration  problem  can  be  cast.  In  section  2.  1 the  modeling  of 
the  blurring  and  addition  of  noise  processes  is  discussed.  Section 
2.2  contains  a brief  discussion  of  the  properties  of  the  Fredholm 
integral  equation  of  the  first  kind.  Its  discretization  is  examined 
in  section  2.  3 and,  finally,  section  2.4  presents  the  several  numeri- 
cal methods  that  have  been  proposed  to  solve  this  equation. 

2.1  The  Model 

Figure  (2.  1-1)  contains  the  block  diagram  of  an  incoherent 
imaging  system.  The  first  source  of  degradation  is  represented  by 
the  point  spread  function  h(a,  S,  p , T]).  It  is  assumed  that  this  blur- 
ring operation  is  linear  so  that  it  can  be  represented  by  a linear  fil- 
tering operation.  The  second  source  of  degradation  represents  the 
addition  of  noise.  Due  to  the  randomness  inherent  in  this  process,  it 
can  only  be  characterized  in  statistical  terms.  Consequently,  due  to 
the  lack  of  complete  knowledge  of  the  degradation,  the  restoration 

cannot  be  perfect  in  the  sense  of  restoring  the  image  to  the  original 
value. 

Assuming  that  all  the  processes  involved  are  available  con- 
tinuously and  unboundedly,  the  following  equation  characterizes  this 
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two  dimensional  model 


+00+® 


3.  P)  = J ■n)h(a,  5;  p , n)d^dri  + n(a, 


< a,  p < ® 


(2.  1-1) 


In  many  situations,  the  input  image  is  available  only  over  a finite 
extent  and  the  previous  equation  reduces  to 


y(<*.  p)  = J J*U  » il)h(a,  ? ; p,  n)d5dn+n(a,  8) 


(2.  1-2) 


In  the  particular  situation  where  the  blur  function  is  isoplanatic,  the 
point  spread  is  a function  of  only  two  variables  and  the  previous 
equation  takes  the  form 


y(a,  p ) = J Jx(?.  Tl)h(a-  5 ; p-  T|)  d ?d  q+  n(a,  p ) 


< a,  p < 


(2.  1-3) 


This  model  is  general  enough  to  include  many  situations  that  occur 
in  optical  systems.  The  hypothesis  of  linear  and  spatially  invariant 
blur  is  valid  in  situations  like  limitation  due  to  diffraction,  for 
example.  In  this  case,  the  blur  function  in  a rectangular  system 
assumes  the  general  form  [2-1]  . 
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h(a.  P)  = 


/ sin  a \ 

/ sin  £\ 

\ a ) 

v.  P ) 

(2.  1-4) 


A nice  feature  of  this  rectangular  system,  is  the  fact  that  the 
degradation  is  separable.  In  other  words,  this  function  of  two  vari- 
ables can  be  cast  into  the  product  of  two  functions  of  one  variable 
each.  Another  example  is  the  blurring  due  to  atmospheric  turbulence 
for  long  photographic  exposure,  in  which  the  point  spread  function  is 
of  the  form  [2-2]. 


h(ct,  p)  = exp  |" -(a2  +P2)5/6J  (2.1-5) 

Several  other  examples  could  be  mentioned.  The  defocussing  [2.  1-1] 
that  the  optical  system  may  impose  over  the  image  is  one  of  them. 
Other  examples  could  include  certain  types  of  optical  imperfections 
and  motion  blur  [2-3]. 

The  assumption  of  space  invariance  of  blur  cannot  be  vali- 
dated under  certain  circumstances.  Examples  of  this  are  motion 
blur  where  objects  at  different  distances  from  the  camera  move  by 
different  amounts  [2-3]  or  certain  optical  aberrations  like  coma, 
pincushion  and  barrel  distortion.  Although  most  of  the  experimental 
work  in  this  dissertation  will  concentrate  on  the  removing  of  spa- 
tially invariant  blur,  the  regression  model  will  not  be  restricted  to 
this  class  of  degradation. 
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The  assumptions  of  additive  noise  are  broad  enough  to  en- 
compass different  situations  in  which  the  limitations  of  the  optical 
and/or  electrical  system  impose  perturbations  known  only  in  a 
statistical  sense*  Stray  illuminatio  n,  circuit  noise,  or  round  off 
due  to  digital  processing  could  be  mentioned. 

Nevertheless,  as  it  should  be  expected,  there  are  restrictions 
in  the  use  of  the  present  model.  The  assumption  of  linearity,  for 
example,  is  subject  to  criticism,  since  ultimately  the  image  is  re- 
corded on  a photographic  medium  whose  characteristic  in  severely 
nonlinear  [2-1].  Even  though  this  nonlinear  function  is  known,  its 
effect  might  be  such  that  the  addition  of  noise  could  occur  before  and 
after  the  nonlinearity.  Such  would  be  the  case  with  stray  illumina- 
tion in  exposure,  followed  by  the  nonlinearity  of  the  H-D  curve,  fol- 
lowed by  roundoff  error  in  digital  processing  of  the  picture.  In  some 
circumstances,  however,  the  effect  of  the  nonlinearity  can  be  lumped 
in  one  block  after  the  addition  of  noise.  Therefore,  its  effect  can  be 
undone  by  an  inverse  operation  prior  to  any  other  operation. 

The  a - sumption  of  additive  noise  can  also  be  criticized.  In 
particulax,  the  effect  of  graininess  in  photographic  materials  is  far 
from  being  additive.  Huang  [2-4]  has  shown  that  it  could  be  modeled 
by  a multiplicative  process. 

Once  the  limitations  of  the  model  are  specified,  the  next  step 
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is  to  clarify  the  use  of  a priori  information  in  it.  First,  the  model 
assumes  that  the  analyst  has  complete  knowledge  of  the  blur  function. 
This  hypothesis  presupposes  that  the  experimenter  has  some  way  of 
measuring  the  modulation  transfer  function.  This  could  be  done  by 
measuring  the  system  itself,  [2-5],  by  theoretical  analysis  [2-1]  or 
by  measuring  the  response  of  a sharp  point  or  edge  in  the  picture 
[2-6,  2-7,  and  2-8]. 

With  respect  to  the  function  x(a.  P).  unless  explicitly  stated, 
it  will  be  assumed  throughout  this  dissertation,  that  it  is  a fixed  but 
unknown  function  to  be  determined,  given  the  values  of  the  output 
function  y(S,  r\).  This  implies  that,  although  the  observed  values 
y(5,  h)  are  random,  the  desired  function  x(a,  p)  is  not  a random  pro- 
cess. This  approach  of  parameter  identification  is  in  contrast  with 
the  Bayesian  approach  that  assumes  an  a priori  statistical  distri- 
bution on  x(a , P),  characterizing  it  as  a random  process.  The  first 
method  leads  itself  to  the  use  of  other  types  of  a priori  information, 
namely,  linear  relationships  involving  values  of:c(a,  p)  and  bounds 
on  their  values.  These  methods  will  be  extensively  explored  in  the 
present  work. 

As  far  as  the  noise  is  concerned,  all  the  methods  used  will 
assume  knowledge  of  the  second  order  statistical  properties.  It  will 
not  be  necessarily  white  although  this  assumption  will  often  be  made. 
If  additional  hypotheses  are  assumed,  further  inferences  will  be 


drawn. 


In  order  to  perform  some  meaningful  restoration,  it  is  neces- 
sary to  define  a goal  to  the  estimation  process.  The  purpose,  of 
course,  is  to  estimate  the  unknown  function  x(a,  3)  , given  the  ob- 
served values  y(?  , h)  , for  some  criterion  of  goodness  of  the  restored 
image.  Assuming  that  the  picture  is  to  be  viewed  by  a human  ob- 
server, the  criterion  should  take  into  account  the  psychophysical 
properties  of  human  vision.  Much  research  is  needed  in  this  field 
so  that  reasonable  criteria,  both  from  the  point  of  view  of  realism 
and  mathematical  tractauility,  could  be  obtained.  In  the  lack  of  a 
better  one,  a squared  error  criterion  will  be  adopted,  namely,  mini- 
mizing the  covariance  between  the  estimated  values  and  true  values. 
Although  it  is  known  that  the  human  observer  does  not  judge  images 
according  to  this  criterion  [2-9],  it  has  been  found  (and  our  experi- 
mental work  tends  to  confirm  this)  that  reasonable  results  can  be 
obtained  by  its  use.  Furthermore,  and  here  is  its  main  advantage, 
the  use  of  a squared  error  leads  to  a very  tractable  mathematical 
structure,  the  regression  model,  that  has  bjen  considerably  explored 
in  mathematical  statistics  and  econometrics. 

2.2  The  Fredholm  Integral  Equation  of  the  First  Kind 

The  problem  of  restoration,  as  stated  in  equations  (2.  1-2)  or 
(2.1-3)  consists  in  solving  a two  dimensional  version  of  the  Fred- 
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holm  equation  of  the  first  kind.  The  same  type  of  integral  equation 
occurs  in  different  physical  problems  as  radioastronomy  [2-10  and 
2-11],  spectroscopy  [2-12],  applied  optics  [2-13],  communication 
theory  [2-14]  and  nuclear  engineering  [2-15]. 

The  ideal  kernel  would  be 

h(a.  ? ; p.  ti)  = p-rj)  (2.2-1) 

since  in  this  case,  with  no  noise 
b b 

y(a,  p)  = J Jx(5,Tj)Ma-5,  p -T))dSdii  = x(a,  p)  (2.2-2) 

a a 

When  the  kernel  is  not  the  6 function,  there  is  a loss  of  resolution 
and  the  problem  that  is  posed  is  the  one  of  recovering  values  of 
x(5  , T))  given  the  values  of  y(  <x,  p)- 

In  order  to  keep  the  equations  in  their  simplest  form,  only 
the  one  dimensional  blur  will  be  considered  in  the  following  para- 
graphs. The  extension  to  planar  equations  is  straightforward. 

Under  this  condition,  under  no  noise,  equation  (2.  1-2)  assumes  the 
form 

b 

y(a)  = Jx(^)h(a.  S)d5  (2.2-3) 

a 

where  the  function  h(a,  5)  is  the  so-called  kernel  of  the  integral 
equation.  Associated  with  this  kernel  there  is  an  eigenvalue  - 


IS 


eigenfunction  problem  defined  by  the  equation 
b 

Jh(a,  S)i(5)d?  = U(a)  (2.2-4) 

a 

The  so-called  spectrum  of  the  kernel,  i.e.,  the  distribution 
of  its  eigenvalues,  determines  the  most  important  properties  of  the 
solution  x(§)  for  a given  observed  value  y(Ct).  For  example,  the 
existence  of  zero  eigenvalues  expressed  by  the  equation 

b 

J h(a,  ?)|(5)d5  = 0 (2.2-5) 

a 

implies  that  the  solution  to  equation  (2.2-3)  will  not  be  unique  because 
a linear  combination  of  eigenfunctions  corresponding  to  zero  eigen- 
values can  always  be  added  to  the  solution  and  the  result  will  still  be 
a solution. 

A real  kernel  h(a,  5)  is  symmetric  if  h(a,  5)  = h(S,  a)»  The 
eigenvalues  of  a symmetric  kernel  are  real  and  eigenfunctions  cor- 
responding to  different  eigenvalues  are  orthogonal,  that  is 

b 

J\(5)f.(5)d5  = 0 (2.2-6) 

a 

X.  4 

i 3 

Furthermore,  the  eigenfunctions  corresponding  to  the  same  eigen- 
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value  span  a linear  subspace.  In  this  subspace  an  orthogonal  basis 
can  be  selected  (say,  by  using  a Gram-Schmidt  procedure)  so  that  it 
is  always  possible  to  have  an  orthonormal  set  (as  a result  of  further 
normalization)  of  eigenfunctions  for  a symmetric  kernel. 

The  kernel  is  defined  to  be  closed  if  it  does  not  have  any  zero 
eigenvalues.  As  a result,  the  solution  to  equation  (2.2-3)  will  be 
unique.  The  kernel  is  said  to  be  separable  if  it  can  be  expressed  as 
the  sum 

N 

h(a,  ?)  = £ fn<a)gn(?)  (2.2-7) 

n=l 

where  N is  finite  and  the  functions  f^B.),  f2(a),  **  * ’ are  linearly 

independent  in  [a,  b].  If  the  kernel  is  separable,  equation  (2.2-3) 
will  have  a solution  only  if  y(a)  is  a linear  combination  of  f (a), 
f^fct),  • • • fN<a)* 

Let  X.,  X_,  • • • , \n,  • « •,  in  order  of  decreasing  absolute  value, 

1 Ct 

be  the  eigenvalues  of  the  real  symmetric  kernel  and  let  ^(a), 

$ (a),  . ..,  $ (a),  •••be  the  corresponding  eigenfunctions  that  are 
2 n 

assumed  to  form  an  orthonormal  set.  It  can  be  shown  that  this 
kernel  can  be  expressed  as 

h(a,  S)  = £ X * (a)0  (5)  (2.2-8) 

n n n 

n=  1 

if  the  series  converges  uniformly. 
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Under  the  conditions  that  a kernel  be  symmetric  and  closed, 
the  set  of  orthonormal  eigenfunctions  forms  a complete  set,  i.e., 
any  function  in  the  space  can  be  expressed  as  a tinear  combination 
of  the  elements  of  this  set.  Consequently,  t.ie  observed  value  y(a) 
can  also  be  expressed  in  this  way 


y(a)  = £ a * (ot)  (2  2-9) 

, n n 
n=  1 


where  the  coefficients  a are  given  by 


b 

an  = J y(a)  ?n(a)da 

a 


(2.  2-10) 


In  this  case  a necessary  and  sufficient  condition  for  equation  (2.2-3) 
to  have  a solution  is  that  the  series 


£ 
n=  1 


(2.2-11) 


converges.  In  case  of  convergence,  the  solution  is  given  by 

00  a 

x(a)  = £ T22-  $ (a)  (2.2-12) 

_ i A n 
n-1  n 

2.  3 The  Discretization  of  the  Integral  Equation 

When  the  Fredholm  integral  equation  is  to  be  solved  in  the 
digital  computer,  a discretization  has  to  be  performed.  This  takes 
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into  consideration  the  fact  that  when  images  are  processed  digitally, 
the  information  is  necessarily  finite  and  discrete.  Therefore,  sup- 
pose that  y(a.  p)  . the  observed  function,  is  sampled  at  a finite  set 

of  points 


y^,  8^)  i = 1,  2,  • • • , I 

j = 1,  2,  • • • , J 

This  implies  that 
bb 

y(a,.  P-)  = JI*<5  , n)h(a.,  ?,  Py  Tj)d?dn  + nfa.j,  p.) 

J a a 


(2. 3-1) 


(2.  3-2) 


In  order  to  reduce  the  problem  to  a complete  discrete  form,  numeri- 
cal quadrature  expressions  must  be  used,  replacing  the  integral  by  a 
weighted  sum  of  the  values  of  the  integrand  at  points 


x(?k*  ^ k = 1,  2,  • • • » K (2.3-3) 

i - 1,  2,  • • • , L 

Under  these  conditions,  one  obtains  the  following  expression 

K L 

y«V  = wk(h<“v  ?k  i V VX(V  V + n(ar  pj’ 

J k=U=l 

i = 1,  2,  ...  , I (2-  3“4) 

j = 1,  2,  » » » , J 
k = 1,  2,  « • • , K 
l - 1,  2,  • • • , L 
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where  0.(0.)  and  I (I.)  represent,  respectively,  the  (KxK)  ((Ixl)) 

“J  J 

matrix  with  all  zero  elements  and  the  (KxK)  ((Ixl))  identity  matrix. 

Using  this  notation,  the  vector  representations  of  the  matri- 
ces X and  Y are  given  by 


x = £ N X v 
- (=1— 1 !—< 


(2.  3-9) 


and 


j=l  J ^ 


(2.  3-10) 


where  x and  ^ are  (K.Lx  1)  and  (I.  Jx  1)  vectors,  respectively.  The 
purpose  of  the  vector  v is  to  extract  the  column  from  X.  The 
matrix  N has  the  rcle  of  placing  this  column  into  the  segment  of 

X/ 

the  (K.Lx  1)  vector  x.  As  a result,  x contains  the  elements  of  X 
scanned  column-wise.  Analogous  considerations  can  be  made  for  the 
vector  and  the  matrix  Y. 

At  this  point,  it  is  also  convenient  to  refer  to  the  inverse 
relation,  that  allows  the  transformation  from  the  vector  form  back 
into  the  two-dimensional  format.  This  manipulation  will  be  useful 
in  transforming  blurred  and  restored  images  into  two-dimensional 
form  for  display  purposes. 


L t 'p 

X = yN  xv 

1=  1 1 1 


(2.  3-11) 


(2. 3-12) 


X = E Mj  XV 
j=i  3 3 


With  this  column  scanning  of  the  two  dimensional  data 
arrays,  the  system  of  equations  assumes  the  form 


jr  = Hx  + n (2.  3- 13) 

X = (I.  Jx  1)  vector 
H = (I.  JxK.  L)  matrix 
x = (K.  Lx  1)  vector 
n = (I.  Jx  1)  vector 

where 


and  the  matrix  His  given  by 


f 
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H = 


Hi.i  ' • -Hi.l 


H 


J.  1 


H 


■J,  L 


where  the  submatrices  H.  , have  the  form 

J.  * 


ft? 


H.  , 

-J.  * 


WUh<°r  51;  * * * WKJth(al’  ’ fy  11 


WHh(ai’  "l  ’ fy  V * * * WKJ?h(ar  ty  ’V 


r 


n 


I 


c< 


u. 

f 


* 
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r 


* 
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The  problem  of  image  restoration  has  now  been  reduced  to  a 
regression  framework,  that  can  be  stated  as  follows:  given  the  ob- 
served vector  Yj  the  blur  matrix  H and  the  second  order  statistics 
of  the  noise  vector  n,  estimate,  according  to  some  suitable  criterion, 
the  vector  of  parameters  x.  In  the  next  chapter,  this  regression 
problem  will  be  treated  extensively,  as  well  as  the  specific  questions 
arising  in  its  solution  in  the  context  of  image  processing. 

Furthermore,  by  the  use  of  additional  a priori  information, 
expressed  by  equality  or  inequality  constraints  on  the  restoration, 
the  problem  of  ill  conditioning  will  be  improved.  This  will  be  the 


object  of  discussion  in  chapter  4 of  this  dissertation. 
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The  problem  of  selecting  the  number  and  the  location  of  the 
nodes  of  the  quadrature  approximation,  as  well  as  the  observed 
values  is  a very  complex  one,  since  it  involves  several  different 
sources  of  error.  The  first  error  comes  from  the  approximation  of 
the  integral  by  the  summation  and  it  will  be  named  quadrature  error. 
It  tends  to  decrease  as  the  number  of  nodes  increases.  The  best 
location  of  the  nodes  is  not  given,  in  general,  by  the  equally  spaced 
distribution.  In  one  dimension,  if  the  nodes  are  located  on  the  zeros 
of  the  set  of  orthogonal  polynomials  on  the  interval  [a,b]  , the  so 
called  gaussian  quadrature  is  obtained  [2-17,  pages  392-395].  It 
provides  the  optimum  precision  in  the  sense  of  maximizing  the  de- 
gree of  the  polynomial  for  which  the  quadrature  is  strictly  correct, 
in  two  dimensions,  the  technique  of  gaussian  quadrature  cannot  be 
easily  generalized  [2-17,  page  419]  since  the  zeros  of  the  ortho- 
gonal polynomials  maybe  complex  or  lie  outside  the  region  of  inte- 
gration. 

Another  source  of  error  may  appear  when  the  continuous 
estimator  x(£,  is  obtained  from  the  discrete  vector  x.  Assume  that 
the  nonrandom  function  x(5 , r\)  is  band  limited  in  the  frequency  plane 
within,  for  example,  the  rectangular  region  given  by  the  coordinates 

-Bu’  +Bu  and  "Bv’  +Bv'  where  u and  v represent  the  coordinates  of 

the  frequency  domain.  If  the  sampling  grid  is  coarser  than  — — 

2B 


~ — (Nyquist  rate),  then  an  aliasing  error  will  occur  when  the  con- 
2B 

v 

tinuous  function  is  obtained  from  the  interpolated  values.  For  a 


given  interval,  this  requires  that  the  number  of  samples  be  above  a 


certain  threshold  if  an  equally  spaced  distribution  of  quadrature  nodes 


is  employed.  The  determination  of  the  threshold  will,  of  course,  de- 


pend on  the  a priori  knowledge  of  the  frequencies  and  B^. 


The  third  source  of  error  comes  from  the  noise  inherent  in 


the  observations  of  the  blurred  picture.  While  the  quadrature  error 


affects  the  process  of  passing  from  the  continuous  to  the  discrete 


description  and  the  aliasing  error  intervenes  in  the  inverse  process, 


the  effect  of  the  noise  is  over  the  estimation  of  the  discrete  values. 


It  becomes  worse  as  the  number  of  nodes  in  the  quadrature  formula 


increases.  It  can  be  measured  by  the  increased  condition  number 


or,  through  a complementary  point  of  view,  by  the  increased  vari- 


ance of  the  estimators,  as  will  be  discussed  in  chapter  3 of  this 


dissertation.  The  type  of  quadrature,  the  location  of  the  nodes  and 


of  the  observation  values  affect  the  blur  matrix  H and,  by  conse- 


quence, the  quality  of  the  estimators. 


If  the  cuadrature  error  can  be  disregarded  with  respect  to 


the  two  other  sources  of  error,  a trade-off  can  be  characterized 


between  aliasing  and  the  effect  of  noise.  A small  number  of  nodes 


implies  small  variances  of  the  estimators  but  possibly  an  aliasing 
error  in  the  reconstruction.  Increasing  the  number  of  nodes  tends 
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to  eliminate  the  aliasing  at  the  price  of  increased  variances  of  the 
discrete  estimators. 

The  number  of  observation  values  y(a»  , £L)  should  be  kept  at 
least  equal  to  the  number  of  nodes  in  the  quadrature  if  no  other  a 
priori  information  is  to  be  incorporated.  Otherwise  this  lack  of  in- 
formation will  be  reflected  in  infinite  variances  of  the  discrete  esti- 
mators. In  the  case  of  use  of  a priori  information,  a trade  off  can 
be  characterized  between  this  information  and  the  one  coming  from 
the  sample. 

This  dissertation  will  be  concerned  mainly  v/ith  the  third 
type  of  error,  namely,  the  one  d\ie  to  noise.  It  will  be  implicity 
assumed  that  the  sampling  is  enough  to  avoid  aliasing  errors  and 
that  quadrature  errors  are  negligible  compared  to  noise  errors. 


2.4  The  Existing  Methods  of  Solution 

Except  for  a few  cases,  the  solution  of  the  Fredholm  equation 
of  the  first  kind  is  far  from  trivial.  Usually  numerical  techniques 
are  used  for  its  solution.  All  methods  of  solution  have  to  face  the 
obstacle  of  the  ill  conditioning  of  the  problem.  This  means  that 
small  perturbations  on  the  observed  values  result  in  very  large 
changes  in  the  solution.  A large  research  effort  has  been  underway 
during  the  last  two  decades  attempting  to  develop  feasible  compu- 
tational methods  to  d^al  with  this  problem. 
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In  general,  these  methods  try  to  circumvent  the  problem  of 
ill  conditioning  by  imposing  side  constraints  on  the  solutions.  An 
example  is  the  method  by  Phillips  (2-18),  who  imposed  the  con- 
straint that  the  solution  be  smooth  by  minimizing  the  criterion 

b 

Min  C [x"(a)]2da  (2.4-1) 

x 

a 

where  x"  (a)  denotes  the  second  spatial  derivative  of  x (a).  If  a 
discretization  is  performed,  a linear  system  of  equations  is  obtained 

_jr  = Hx+C  (2.4-2) 

where  H is  a square  matrix.  In  Phillips'  method,  an  estimator  x 

is  forced  to  satisfy  a quadratic  equality  constraint  related  to  the 

2 

noise  level  involved  ( l ) 

(_jr  - Hx)T(^  - H x)  = -t2  (2.4-3) 


and  the  solution  is  obtained  by  minimizing  a quadratic  form 
measuring  the  smoothness  constraint 

T 

Minx  S x (2.4-4) 

x 

The  result  of  the  equality  constrained  optimization  problem  is 
given  by 
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S=[HtvrVsjV  (2.4-5) 

where  Y is  a Lagrange  Multiplier  that  specifies  the  amount  of  smo- 
othing imposed  by  the  constraints.  It  was  also  shown  that  an  esti- 
mator of  the  perturbation  vector  £ relates  to  the  Lagrange  Multiplier 
Y and  the  optimal  solution  through  the  expression 

1 - -YtH-Vsi  (2.4-6) 

A trial  and  error  method  was  used  giving  the  largest  value  of  the 

smoothing  coefficient  Y compatible  with  the  constraint  expressed  by 
equation  (2.4-3). 

A generalization  of  the  method  of  Phillips  as  well  as  a sim- 
plification was  performed  by  Twomey  £2-19  and  2-20].  The  solution 
expressed  by 

5 = (HTH  + YS|-IHTI  (2. 4-7) 

involves  only  one  matrix  inversion  instead  of  two  in  the  method  by 
Phillips.  In  addition,  the  met!  d allows  solutions  in  the  case  where 
the  matrix  H is  not  square.  It  should  be  observed  at  this  point  that 
for  S equal  to  the  identity  matrix,  Twomey 's  method  reduces  to  the 
so-called  method  of  ridge  regression  [2-21,  2-22,  and  2-23]  that 
attempts  to  trade  a small  amount  of  bias  in  the  statistical  procedures 
in  order  to  achieve  a major  reduction  in  the  variance  of  the  estimator. 


Among  the  other  methods  that  yield  numerically  stable  solu- 
tions to  equation  (2.2-3)  is  the  regularization  method  of  the  Russian 
mathematician  A.  N.  Tichonov  [2-24,  2-25,  and  2-26].  Likewise, 
the  method  imposes  the  constraint  that  the  solution  be  a piecewise 
smooth  function.  It  is  based  on  the  minimization  of  a functional 
which,  after  discretization,  assumes  the  form 

MY  [x,  = io(Hx -}l)T(Hx -£)  + YxT(--^S  + A SP'W 

) (2.4-8) 

where  S and  Pare  appropriate  positive  definite  matrices  that  define 
the  smoothness  constraint  and  A a and  A?  are  discrete  increments 
on  equation  (2.2-3).  The  solution,  for  any  Y > 0,  was  shown  to  be 
given  by 

xY  = A a £ AaHTH  + Y + A 5 H y_  (2.4-9) 

and,  again,  a trial  and  error  process  is  involved  in  the  determination 
of  the  optimal  value  of  the  coefficient  Y.  The  method  can  also  be 
generalized  so  that  the  functional  would  involve  higher  order  dif- 
ferences on  the  solution  vector. 

In  the  context  of  image  processing,  the  solution  of  the  planar 
integral  equations  involves  additional  difficulties  due  to  the  large 
dimensionality  required  when  a discretization  of  the  equations  is 
made.  Under  the  conditions  of  the  separability  of  the  matrix  H as  a 
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Kronecker  product,  Ekstrom  [2-27]  restructured  the  calculations 
using  a singular  value  decomposition  of  the  same  matrix. 

Other  methods  [2-28,  2-29,  2-30,  2-31,  2-32,  2-33,  2-34, 
and  2-35]  have  also  been  suggested  to  solve  the  deconvolution  prob- 
lem. In  general,  these  methods  were  proposed  to  solve  one  dimen- 
sional, small  dimensionality  problems  and,  as  pointed  out  by 
Ekstrom  [2-36],  some  sort  of  reformulation  of  the  problem  is  often 
needed  in  order  to  adapt  these  procedures  to  the  large  dimension- 
alities that  occur  in  two-dimensional  problems. 

A significant  development  is  possible  in  the  solution  involving 
large  amounts  of  data,  when  the  kernel  h(a,  §,  p,  t))  is  shift  invariant, 
that  is, 

h(cu  S,p,r|)  = h(a-p,§-ri)  (2.4-10) 

and  the  functional  that  expresses  the  smoothness  constraint  is  given 
by  a convolution  expression.  In  this  case  (H^H)  and  S are  Toeplitz 
matrices  in  the  one  dimensional  case  and  block  Toeplitz  in  the  two 
dimensional  case.  By  extending  the  domain  of  the  convolutions  and 
transforming  them  into  circular  operations.  Hunt  [2-37,  2-38, 

2-39,  and  2-40]  used  Fast  Fourier  Transform  techniques  to  solve 
Twomey's  method. 


3.  REGRESSION  TECHNIQUES 


In  the  previous  chapter  a blurred  digital  picture  corrupted  by 
noise  was  modeled  by  the  expression 

y_  - H x + n (3-1) 

where 

- (lx  J)  x 1 vector 
H = (lx  J)  x (Kx  L)  matrix 
x = (KxL)  x 1 vector 
n = (lx  J)  x 1 vector 

In  this  discrete  form,  the  problem  consists  of  performing  an  esti- 
mation of  the  parameter  vector  x,  given  the  observed  vector  the 
knowledge  of  the  matrix  H and  the  statistical  distribution  of  the  noise 
vector  n. 

In  order  to  proceed  with  the  derivation  of  the  solution  and  its 
properties,  it  is  necessary  to  consider  the  possible  dimensions  in- 
volved in  the  model.  For  the  sake  of  simplification, 

I x J = M (3-2) 

Kx  L = N (3-3) 

Two  cases  are  possible:  M s N and  M < N.  In  the  first  case,  which 
would  occur  if,  for  example,  I 2K  and  J s L,  the  number  of  nodes  in 
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the  quadrature  expression  is  less  than  or  equal  to  the  number  of 
samples  of  the  observed  image.  In  the  second  case  the  opposite 
situation  occurs.  The  latter  model  would  tend  to  occur  when  the 
experimenter  increases  the  number  of  nodes  in  order  to  improve  the 
discrete  approximation  of  the  integral  equation  that  represents  the 
blurring  process.  In  the  case  for  which  M 2:  N,  depending  on  the 
values  of  the  H matrix,  its  rank  may  or  may  not  be  given  by  the 
number  of  columns,  while  in  the  case  for  which  M < N,  the  rank  is 
necessarily  less  than  the  number  of  columns  of  H.  As  a matter  of 
notation,  the  model  of  full  column  rank  is  called  overdetermined. 

If  the  matrix  H is  not  of  full  column  rank,  the  model  is  said  to  be 
unde  rde  ter  mined.  The  overdetermined  model  leads  to  the  use  of 
classical  regression  techniques  for  its  solution,  while  the  underde- 
termined scheme  will  require  the  concept  of  pseudoinverse  and  ex- 
tensions of  the  previous  case. 

3.  1 The  Overdetermined  Model 

Consider  the  overdetermined  model,  i.e.,  under  the  condi- 
tions of  rank  of  the  matrix  H being  determined  by  the  number  of 
columns.  Suppose,  furthermore,  that  the  noise  has  zero  mean  and 
covariance  matrix  V_,  assumed  to  be  positive  definite.  The  vector 
x is  fixed  but  unknown  and  the  task  is  to  obtain  an  estimator  x of  x 
according  to  some  criterion.  The  chosen  estimator  is  the  best 
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linear  unbiased  estimator  (B.L.U.E.)  of  x.  This  means  that  one  is 
searching  for  an  estimate 

x = G y (3.  1-1) 

such  that 

E(G^)  = x (3.1-2) 

Let  Va  denote  the  covariance  matrix  of  the  optimal  estimator  vector 
— x 

x and  V_  the  covariance  matrix  of  any  other  linear  estimator  that 
— ’ -x 

satisfies  (3.  1-2).  It  is  noted  that  (V_  -V*)isa  positive  semi- 

X x 

definite  matrix.  The  optimal  solution  is  given  by  the  Gauss-Markov 
Theorem  [3-1,  page  52] 

x = (HT  V'1  H)”1  HT  V"1  x (3.1-3) 

and  its  covariance  matrix  is 

V*  = (HT  V_1  H)-1  (3.  1-4) 

_x  — “ 

Suppose  now  that,  instead  of  trying  to  estimate  the  set  of 
pixel  values  x^  i = 1,  2...  N,  one  is  interested  in  estimating  a 
linear  functional  of  the  x^s.  An  example  could  be  the  estimation  of 
the  integral  of  the  original  picture  that  would  be  observed  by  the 
output  of  a photocell.  The  linear  functional  § can  be  represented  by 


the  inner  product 
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T 

5 = £ x (3.  1-5) 

The  task  then  is  to  obtain  a B . L.  U.  E.  estimator  for  $ . 

Using  Lagrangean  methods  [3-2,  page  33}  it  is  possible  to  show  that 

T T -1  -1  t - 1 

$ = c (H  V H)  H V x (3.  1-6) 

This  means  that 

| = £T  x (3. 1-7) 

The  optimal  estimator  of  x could  also  be  derived  by  considering 
parametric  functions 

x i = 1,  2....N  (3.  1-8) 

where  e^  is  the  i—  column  of  the  identity  matrix.  In  this  case  x will 
be  formed  by  the  set  of  B.L.U.E.  estimators  for  each  one  of  its 
components. 

The  same  result  could  have  been  obtained  by  another  method, 
namely,  the  one  that  minimizes  the  weighted  sum  of  squares  of  the 
residuals.  This  is  the  method  of  least  squares,  which  was  first 
developed  by  Gauss.  In  this  case  one  seeks  for  the  vector  x that 
minimizes  the  quadratic  expression 

0 (x)  = - Hx)T  V"  1 (x  - Hx) 


(3.1-9) 
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Taking  derivatives  and  equating  to  zero,  one  obtains 


T - 1 T - 1 

■2H  V J_  + 2H  V H x 


= 0 


(3.  1-10) 


or 


T - 1 T _ 1 

H V Hx  = H V 


(3.  1-11) 


This  is  the  set  of  normal  equations  of  the  least  squares  prob- 
lem. Under  the  hypothesis  of  full  column  rank  of  the  blur  matrix  H 

and  positive  definiteness  of  the  covariance  matrix  V,  the  matrix 
T - 1 

(EL  Y.  ED  ^ Invertible  and  the  set  of  normal  equations  has  a unique 
solution  given  by 

£ = (HT  V'1H)"1HT  v'1  (3.1-12) 

A comparison  of  equations  (3.  1-3)  and  (3.  1-12)  will  confirm  the 
assertion  that  the  B.L.U.E.  and  least  squares  estimators  of  x are 
identical. 

When  the  noise  is  white,  V becomes  an  identity  matrix  and 
expression  (3. 1-12)  reduces  to 


or 


Let 


T IT 

x = (H  H)  H £ 

A + 

x = H £ 

H+  = (HTH)_1HT 


(3.  1-13) 


(3.  1-14) 


(3.  1-15) 
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The  matrix  H is  called  the  pseudoinverse  or  Moore-Penrose  gen- 
eralized inverse  of  11  [3-3].  A more  complete  discussion  of  its 
properties  will  be  given  in  connection  with  the  discussion  of  the 
unde  rde  ter  mined  model. 

Assume  for  the  moment  that  the  noise  is  white.  Therefore, 
the  least  squares  problem  reduces  to  the  minimization  of  the  square 
of  the  norm  of  the  residual  vector 

||  i - iJi  f (3.1-16) 

In  order  to  obtain  greater  understanding  over  the  question  of 
existence  and  uniqueness  of  the  restoration  problem,  some  heuristic 
arguments  will  be  presented.  Consider  Figure  (3.  1-1)  where  the 
decomposition  of  a finite  dimensional  linear  space  into  the  direct 
sum  of  two  linear  subspaces  is  represented  [3-3]  , namely, 

^ = RHT  + NH  (3.1-17) 

and 

EM  = RH  + Nht  (3.  1-18) 

N 

As  x varies  over  E , the  vector  ^ = Hx  varies  over  R(H). 

2 

Therefore,  the  problem  of  minimizing  j|  £ - Hx  J|  over  x can  be 

2 

reduced  to  the  one  of  minimizing  |J  X,  - j|  where  ^ is  in  R^. 
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i 


i 

L 

H. 


From  geometrical  considerations  it  is  clear  that  this  is  solved  for 
^ given  by  the  projection  of  ^ onto  R^.  Since  ^ is  in  R^,  there 
always  exists  a solution  to  the  problem 


H*  = 


(3.  1-19) 


which  implies  that  a solution  to  the  least  squares  problem  always 
exists.  Now,  the  solution  will  be  unique  if  and  only  if  the  null  space 
of  H,  N , is  composed  only  of  the  zero  vector.  Indeed,  assume  the 
solution  is  unique.  Therefore,  the  null  space  of  H has  to  contain 
only  the  zero  vector  because  otherwise  a nonzero  vector  in  could 
always  be  added  to  x without  affecting  On  the  other  hand,  assume 
that  N comprises  only  the  zero  vector.  If  the  solution  is  not  unique, 
say  x'  and  x"  being  two  distinct  solutions,  then  x'  - x"  would  be  in 

N , which  is  a contradiction. 

H 

In  the  overdetermined  case  the  columns  of  the  blur  matrix  H 
are  assumed  to  be  linearly  independent,  which  implies  that  the  null 
space  of  H contains  only  the  zero  vector,  otherwise  there  would  be  a 
nontrivial  linear  combination  of  these  column  vectors  resulting  in  the 
zero  vector.  This  explains  the  unique  solution  that  was  obtained  for 
the  normal  equations.  In  the  underdetermined  case  this  will  not  hap- 
pen and  there  will  be  many  solutions  to  the  least  squares  problem. 
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3.2  The  Hypothesis  of  Normality  and  Interval  Estimation 

it  will  be  assumed  in  this  section  that  the  noise  is  Gaussian. 
Besides  the  fact  that  it  occurs  often  in  practice,  this  assumption  also 
has  the  advantage  that  it  will  allow  the  derivation  of  further  prop- 
erties of  the  estimators. 

Accordingly,  let  the  components  of  the  noise  vector  n,  n^, 

n , •••  , n.  .,  be  jointly  distributed  with  a multivariate  normal  distri- 
2 M 

bution 

n ~ N(0  , V)  (3.2-1) 

denoting  that  the  mean  is  the  zero  vector  and  the  covariance  matrix 
is  V . Therefore,  given  the  parameter  vector  x,  the  probability 
density  function  of  the  observed  vector_yis  given  by 

p(y|x)  = TJJZ 7 exp  {-i(y  -Hx)T  V 1 (y  -Hx)l  (3.  2-2) 

(2tt)M/^  |V|t 

Consider  now  the  maximum  likelihood  estimator  of  the  vector  of  the 
original  pixel  values  x.  By  definition  [3-4,  page  193]  this  estimator 
is  obtained  by  maximizing  over  x the  expression  of  p(y|x).  One  may 
take  log  before  maximizing  since  it  is  a monotonically  nondecreasing 
function.  In  doing  this  one  observes  that  the  maximum  likelihood 
estimator  x minimizes  the  quadratic  expression 


(y_  - Hx)T  V" 1 (y_  - Hx) 


(3.  2-3) 


Under  these  conditions,  given  the  observed  values  of  the 
blurred  and  noisy  picture,  the  maximum  likelihood  estimator  of  the 
original  pixel  values  the  least  squares  estimator  (and  the 

i 

t 

B.L.U.E.  estimator  for  the  overdetermired  model),  if  the  hypothe- 
sis of  normalitv  is  assumed.  Since  the  maximum  likelihood  estima- 
tor has  che  desirable  properties  of  consistency  and  asymptotic  ef- 
ficiency, the  Gaussian  hypothesis  allows  the  extension  of  these 
properties  to  the  estimators  derived  under  the  two  other  criteria. 

In  the  following  discussion  the  assumption  of  white  noise  will 
be  made.  The  purpose  will  be  to  derive  estimators  for  the  variance 
of  the  noise  that  corrupts  the  image.  Under  the  white  noise  hypothe- 
sis, (3.2-2)  assumes  the  form 


P&|x)  - m7F”m  ’ exp(  — (£-if£)T(Z-Hx))  (3.2-4) 

(2tt)  a Vo  2 / 


If  the  log  likelihood  function  is  maximized  by  setting  the  derivative 
with  respect  to  n equal  to  zero,  one  obtains 


(l-Hx)T(}r-Hx)  - = 0 

2 a 2a 


(3.2-5) 


W 


The  expression  for  the  maximum  likelihood  estimator  of  the 
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coefficient  vector  x has  already  been  derived.  By  substituting  this 

2 2 

value,  the  maximum  likelihood  estimator  for  o , a , is  obtained 

&2  = (3.2-6) 

Now,  consider  the  quantity 

v = x-£  = X -Hx  = 

T -1  T 

= l-H(H  H)  H £ = 

T -IT 

= [I_-H(HH)  H ]jr 
= Li  (3.2-7) 

Since  _L  is  the  difference  of  two  symmetric  matrices,  it  follows  that 
Ij  is  also  a symmetric  matrix.  Also  I,  is  idempotent  as  it  can  be 
shown  by  the  following  derivation 

2 T-1T?  T-1T  T-lT 

L = [I_-H(H  H)  H ] = I-H(H  H)  H -H(H  H)  H 

T -1  T T -IT 
+ H(H  H)  H H(H  H)  H 

= J_-H(HTH)_1  HT  =L  (3.2-8) 

Furthermore,  the  trace  of  L can  be  obtained  as  follows 

tr Li  = trl_  - trH (HTH)  HT  = M-tr(HTH) ' 1 _HTR  = M-N  (3.  2-9) 
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From  the  fact  that  the  rank  of  an  idempotent  matrix  is  equal  to  its 
trace,  it  follows  that  the  rank  of  is  (M-N).  Also,  observe  that 

LH  = 1 HT]H  = _H-H  = 0 (3.2-10) 

Now,  consider  another  possible  estimator  foi  the  parameter 
0^,  namely 


a2  ■ 

(3.2-11) 

1 aT  . 

= — v V 

M-N  - - 

The  following  relationship  is  valid 

v = L v = L(H  x + n)  = L n (3.  2-12) 

where  the  fact  that  LH  = 0_  was  used.  Therefore, 

vTv  = nTLT  Ln  = nT  Ln  = trLnnT  (3.2-13) 

The  second  equality  comes  from  the  idempotency  of  L and  the  third 

nr 

is  based  on  the  fact  that  n"  Ln  is  a scalar  and  therefore  equal  to  its 
own  trace. 


By  taking  expectations  one  obtains 


42 


T.  2 


E(v  v)  = E(trLnn  ) = tr  LE(nn  )=a  tr  LI_  = a (M-N)  (3.2-14) 


2 

As  a result  of  the  last  expression,  s is  an  unbiased  estimator  for 

the  variance  of  the  noise.  Observe  that,  although  vTv  is  the  maxi- 

M — — 

2 

mum  likelihood  estimator  of  cr  , it  is  not  an  unbiased  estimator. 

It  could  be  also  of  interest  to  determine  an  estimator  for  the 

covariance  matrix  of  the  estimator  x,  namely,  a2  (HTH)-1.  Since 

s is  an  unbiased  estimator  for  02,  it  follows  that  s2(HTH)-1  is  an 

unbiased  estimator  of  o (H^H)-^. 

It  has  already  been  observed  that,  under  the  normality 

assumption,  the  vector  of  estimated  pixel  values,  x,  is  distributed 

according  to  a multivariate  normal  distribution.  Observe,  further - 
2 T T 

more,  that  (M-N)s  = L y = n L,n  and  that  _L  is  an  idempotent 

matrix  of  rank  (M-N).  This  fact  implies  that  the  quadratic  form 
(M-N)s2  . v 2 

2 has  a X distribution  with  (M-N)  degrees  of  freedom 
a 

[3-5,  page  91]. 

Now,  observe  that  the  matrix  L and  the  matrix  (HTH)_1HT  of 
x = (H  H)  satisfy 


(H TH ) " 1 HT  L = (HTH ) " 1 H T [ I-H  (H  TH ) " 1 HT  ] = (H  TH ) " 1 HT 

- (hth)' 1 ht  = 0 


(3.2-15) 


A 2 

This  implies  £3-6,  page  89]  that  x and  s are  independently  dis- 
tributed. 

Let  us  reconsider  now,  still  under  the  Gaussian  hypothesis, 
the  problem  of  estimating  a linear  functional  of  the  pixel  values  of  a 
picture,  like  the  sum  of  the  pixels  or  a single  pixel  value,  for 
example.  Expression  (3.  1-6)  gives  the  value  of  the  B.  L.U.E.  esti- 

A 

mator  f . That  expression  can  be  put  into  the  form 


where 


f 


T 

H X 


T 

u 


T T - 1 - 1 T 

= £ (H1  V JH)  1Ili 


(3.  2-16) 


(3.  2-17) 


Since  n is  normally  distributed  with  zero  mean  and  covari- 
ance matrix  V,  it  follows  that  ^ is  also  normally  distributed  with 
mean  H x and  covariance  V.  On  the  other  hand,  §,  being  a linear 
combination  of  Gaussian  random  variables,  is  also  a Gaussian 
random  variable  and  its  variance  is  expressed  by 

A X 

var  ( $ ) = u Vu  (3.2-18) 

A 

Sirce  I is  an  unbiased  estimator  of  § , the  random  variable 


*n 


^ var(f ) 


(3.2-19) 


is  zero  mean,  unit  variance  and  Gaussian.  As  a consequence,  the 
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probability  that  ft  falls  in  the  interval  f-K.K’'  is  given  by 


K 

Prf-K  s i 5 K]  = | -=  e*p(--f-  ) 

,K  /2T  V 2 / 


dx  (3.2-20) 


or 


Pr-UK  5 £ K 


{- 


/ var(  |) 


^ = a 


(3.2-21) 


where 


K 

. r i (^-x2A 

a - J 7=  expV~/ 


dx 


(3.2-22) 


It  is  our  interest  to  derive  the  confidence  interval  at  a given 
level  a for  the  parametric  function  $ . In  view  of  (3.2-21)  this  can 
be  given  by 


IK($)  = {*  - K(var($)$  , i + K(var(f))i}  (3.2-23) 


or 


I^(i)  = {uTx  - K(uTVu)1  , uTx  + K(uTVu)^j  (3.2-24) 


For  each  value  of  K,  the  corresponding  confidence  level  is  tabulated 
below  [3-2,  page  38]. 
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K 

a 

1.0 

0. 6827 

2.  0 

0.9545 

3.0 

0.9973 

4.  0 

0.9999 

It  is  possible  to  give  a geometrical  interpretation  of  the  con- 
fidence ellipsoid.  This  interpretation  will  provide  considerable  in- 
sight into  the  properties  of  the  estimators,  and  will  open  the  path 
toward  the  discussion  of  the  Influence  of  the  perturbations  in  the 
solution  of  the  linear  equations  involved.  Excellent  discussions  of 
this  interpretation  are  found  in  references  [3-2,  pages  40-58  and 
3-7,  pages  406-411]. 

Consider  the  expression  given  by  equation  (3.2-3).  For  a 
given  observed  value  y_,  that  expression  represents  a quadratic 
function  in  N-dimensional  space  of  the  x^  variables.  Under  the  over- 
determined model  the  solution  of  the  normal  equations  is  unique. 
Therefore,  the  minimum  of  the  quadratic  form  is  obtained  at  a 
unique  point  x . For  other  values  of  x the  residual  surface  assumes 
the  shape  of  a parabolloid.  Let  rQ  denote  the  minimum  value  of  the 
quadratic  expression.  Consider  the  expression 

(i-Hx)T  V_1(x-iix)  = rQ+K2  (3.2-25) 

Upon  the  substitution  of  the  value  of  £ given  by  the  solution  of  the 


46 


normal  equations,  it  is  easy  to  verify  that 

(x -x)T  HT  V"1  H(x-x)  = K2  (3.2-26) 

This  is  the  expression  of  an  ellipsoid  with  center  at  the  point  x in  the 
N -dimensional  x - space.  Following  reference  r3-2l  this  ellipsoid 
will  be  called  the  K - ellipsoid.  Figure  (3.2-1),  obtained  from  ref- 
erence [3-2,  page  42]  shows  the  residual  surface  and  the  K - ellip- 
soid for  N = 2. 

Consider  now  a vector  h in  the  N-dimensional  space.  For  a 
nondegenerate  ellipsoid  (this  is  the  case  with  the  overdetermined 
model),  there  will  be  two  (N-l)  dimensional  planes  orthogonal  to  h 
and  tangent  to  the  ellipsoid.  These  are  planes  such  that  the  ellipsoid 
lies  entirely  on  one  side  of  and  has  at  least  one  point  in  common  with 
them.  Following  Scheffe  [3-7]  this  planes  will  be  called  planes  of 
support  of  the  K - ellipsoid. 

On  the  other  hand,  equation  (3.  1-7)  gives  the  value  of  the 
estimator  of  the  parametric  function  $ as  expressed  by  the  inner 
product  of  the  vector  £ and  the  estimator  x . if  one  considers  the 
planes  of  support  of  the  K - ellipsoid  perpendicular  to  £,  their  ana- 
lytical expressions  will  be  given  by  [3-2,  page  41] 

= c X + K[c  (H  V H)  cp 


T 

C X 


(3.  2-27) 


, 


■ s. 

I 


I 

■fc. 


r 

* 


n 

5 


t 


I 


,jL 


(x -^)THTy_lH(x-t)  = K2 


Figure  (3,2-1)  The  Residual  Surface  and 
the  K -Ellipsoid  for  N=2 


r 
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Since 

var ( f)  = uT  V u = c^ (H^  V 21)  £.  (3.2-28) 

it  follows,  comparing  (3.  1-7),  (3.2-24)  and  (3.2-27)  , that  the  con- 
fidence interval  I^(  $)  can  be  given  by  the  distance  between  the  two 
points  where  the  planes  of  support  touch  the  K - ellipsoid.  Figure 
(3.2-2),  obtained  from  reference  [3-2,  page  43]  illustrates  the 
previous  assertion.  The  same  figure  also  shows  that  the  width  of 
the  confidence  interval  is  proportional  to  the  distance  between  the 
two  support  planes. 

Since  the  width  of  the  confidence  intervals  of  parametric 
functions  of  pixel  values  is  proportional  to  the  distance  between  the 
planes  of  support,  and  since  this  distance  will  vary  depending  on  the 
direction  of  the  vector  £with  respect  to  the  axes  of  the  ellipsoid,  it 
is  important  to  characterize  the  directions  of  these  axes  in  terms  of 
measurable  quantities. 

Equation  (3.2-26)  gives  the  analytical  expression  of  the  K - 

N 

ellipsoid.  If  a translation  of  origin  in  the  E space  is  made  through 
the  equation 


v 


(3.  2-29) 


equation  (3.2-26)  assumes  the  form 
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T T - 1 2 

v H V H v = K (3.2-30) 


In  order  to  find  the  directions  of  the  principal  axes  of  the  K - 
ellipsoid,  first  observe  that  a radius  vector  from  the  origin  to  any 
point  on  the  surface  of  the  ellipsoid  will  be  colinear  with  one  of  the 
principal  axes  if  and  only  if  it  will  be  in  the  direction  of  the  normal 
to  the  surface  at  that  point.  On  the  other  hand,  the  ellipsoid  can  be 
considered  as  an  equipotential  surface  [3-2,  page  45]  of  the  scalar 
field 


P(v) 


T 

v 


(3.2-31) 


so  that  the  normal  to  the  surface  can  be  obtained  by  the  direction  of 
the  gradient  vector 

V(p)  = 2HT  V"1  Hv  (3.2-32) 

Consequently,  the  problem  of  finding  the  principal  axes  of  the  ellip- 
soid reduces  to  the  one  of  finding  axes  that  are  colinear  with  the 
gradient  vector . This  is  expressed  by  the  following  equation  inj>, 
for  some  constant  \ 


(3.  2-33) 


The  previous  equation  represents  an  eigenvector -eigenvalue 
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T - 1 

problem  and  the  fact  that  H V H is  symmetric  and  positive  defi- 
nite (since  V is  positive  definite  and  H has  full  column  rank  in  the 
overdetermined  model)  guarantees  that  its  eigenvalues  are  all  real 
and  posidve.  The  eigenvectors  can  always  be  chosen  to  be  mutually 

orthogonal  and  these  will  be  the  directions  of  the  principal  axes. 

2 

Consider  now  the  diagonal  matrix  Q containing  the  eigen- 
T -1 

values  of  H V H in  decreasing  order.  Consider  also  the  unitary 

matrix  P such  that  its  columns  are  the  normalized  corresponding 

2 

eigenvectors.  The  matrix  fi  is  obtained  by  the  following  transfor- 
mation 

T T -1  2 

P H V HP  = n (3.2-34) 

In  order  to  obtain  the  axes  of  the  ellipsoid  a rotation  of  coordinates 
is  performed 

T 

L = P v (3.2-35) 

This  will  align  the  axes  of  the  ellipsoid  with  the  axes  of  the  coordi- 
nate system.  By  solving  for  v in  the  previous  equation  and  substi- 
tuting in  (3.2-30),  the  following  expression  is  obtained 

T T T -1  2 

r PA  H V HP  r = K (3.2-36) 


and,  using  (3.2-34),  this  reduces  to 
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It  also  follows  that  the  principal  axes  of  the  ellipsoid  have  lengths 
inversely  proportional  to  the  square  root  of  the  corresponding  eigen- 
values. 

Recall  that  the  width  of  the  confidence  interval  for  parametric 
functions  of  pixel  values  is  proportional  to  the  distance  between  the 
planes  of  support.  Now,  if  the  vector  c_  that  specifies  the  parametric 
function  is  parallel  to  an  eigenvector  that  corresponds  to  a small 
eigenvalue,  the  distance  between  the  planes  of  support  will  be  larger 
than  in  the  situation  where  c_  is  parallel  to  an  eigenvector  cor  res- 


ponding  to  a large  eigenvalue  of  H V H. 

So  far  the  confidence  interval  for  parametric  functions  of 

pixel  values  has  been  derived  under  the  assumption  that  the  variance 

of  the  noise  is  known  to  the  experimenter.  When  this  is  not  the  case 

the  confidence  interval  can  be  determined  as  follows.  First  observe 

T a X 

that,  under  white  noise  conditions,  £ x -£  x is  normally  distri- 

2 T T - 1 

buted  with  zero  mean  and  variance  0 c (H  H)  £.  Therefore,  the 
ratio 


^f(x  - x) 

/~T  T -1 
o /c  (H  H)  c 


(3.2-40) 


should  be  a standardized  Gaussian  random  variable.  The  parameter 


* L 

a is  not  known  but  it  has  already  been  derived  that  x and  s = 


1 T 

r(y-Hx)  (y  - Hx)  are  independently  distributed.  Therefore,  the 

M-N  2 


ratio  given  by  (3.2-40)  and 


(M-N)t 


are  also  independent.  Since 


( M-N)s  ^2  with  M-N  degrees  of  freedom,  it  follows 

a2 
that 


cT(x-x)  • v/M-N 


e /cT  (HTH) " 1 c _ cT(*-x) 

-7— =========  " 7TTT 

/m-N)s2  «/£  (H‘H)  *£ 


(3.2-41) 
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is  distributed  according  to  a Student  distribution  with  (M-N)  degrees 

of  freedom.  The  determination  of  the  confidence  interval  is  then 

easily  done  by  using  tables  of  this  distribution.  The  determination 

of  the  confidence  interval  for  the  unknown  variance  can  be  done  by 

2 

observing  that  ^ — has  a chi-square  distribution  with  (M-N) 

Li 

a 

degrees  of  freedom. 

3.  3 Analytic  Study  of  the  Condition  Number 

This  section  considers  the  effects  that  perturbations  on  the 
observed  blurred  pixels  have  on  the  estimated  original  pixel  values, 
from  the  complementary  point  of  view  of  the  numerical  analyst. 
Under  this  perspective, the  estimation  of  pixel  values  would  consist 
in  the  problem  of  solving  a system  of  linear  equations 

Hx  = i (3.  3-1) 

such  that  the  right  hand  side  is  subject  to  perturbations.  These 
errors  represent  the  role  of  the  noise  in  the  system.  Consider  the 
effect  of  the  perturbation  vector  n on  the  solution  of  the  system  of 
linear  equations.  Call 

x = x + Ax  (3.  3-2) 


the  solution,  x being  the  true  vector.  The  set  of  normal  equations 
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f 


i 


n 


n , 
*. 


* 

C 


k 

Jix 


(HT  V-1  H){x  + Ax)  = HT  V'1^  = HT  V_1(Hx  + n)  (3.3-3) 

gives  the  solution  for  the  perturbed  system.  Reduction  of  the  pre- 
vious equation  gives 

_H T V ~ * H Ax  = H T V " 1 n (3.3-4) 

At  this  point,  assuming  the  overde  ter  mined  model,  one  could  simply 
invert  (H  V H)  in  order  io  obtain  the  change  Ax  in  the  solution  of 
the  linear  system  due  to  the  perturbation  n.  A decomposition  of  the 
matrices  involved  will  be  performed,  however,  giving  more  insight 
into  the  problem  [3-2,  pages  47-58].  The  assumption  that  V is 
positive  definite  leads  to  the  possibility  of  a decomposition  of  the 
form 

V'1  = CTC  (3.3-5) 

so  that  equation  (3.4-4)  can  be  written  as 

(HT  CT  C H) Ax  = HT  CT  C n (3.3-6) 

A factorization  of  the  matrix  C_H  will  now  be  performed.  This 
is  the  so-called  singular  value  decomposition  of  a rectangular  matrix 
f 3-3,  page  38'1 

CH  = PL^Q 


(3.3-7) 
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where  Ij  is  the  (NxN)  diagonal  matrix  of  the  eigenvalues  of 
T T 

(H  C C.H),  P is  a unitary  matrix  whose  columns  are  the  eigen- 
T T 

vectors  of  (CHH  C_  ) and  Q is  also  a unitary  matrix  whose  rows 

T T 

are  the  eigenvectors  of  (H  £H).  As  a result,  equation  (3.3-6) 


can  be  rewritten 


(QT  LQ)£x  = QTL>PTCn  (3.3-8) 

or 

L Q Ax  = L^PTCn  (3.3-9) 

T T 

Since  (H  C_  C H)  is  nonsingular,  Ij  is  also  nonsingular  and  one  ob- 

T -1 

tains  (by  multiplying  both  sides  of  the  equation  by  Q _L  ) 

Ax  = QT  L ^ PT  C n (3.3-10) 

The  previous  equation  can  also  be  written 


Ax 


N 

= Z 

i = l 


(PT  Cn)t 

w. 


• % 


(3.3-11) 


T T 

where  are  the  rows  of  Q (eigenvector  s of  H C_  C^H)  and  w^, 

T T 

i = 1,  2,  • • • N are  the  square  root  of  the  eigenvalues  of  (II 

The  last  quantities  are  called  the  singular  values  of  the  matrix  C_H. 

Equation  (3.  3-11)  shows  that  the  component  of  the  error  along 

T - 1 

each  of  the  eigenvectors  of  (H  V II)  is  inversely  proportional  to 
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the  corresponding  singular  value  of  C_H . Assuming  that  the  com- 
T 

ponents  of  (P  C. n)  do  not  vary  much  in  magnitude,  the  components 
of  Ax  will  tend  to  be  larger  in  the  direction  of  eigenvectors  cor- 
responding to  smaller  singular  values. 

So  far  the  analysis  of  perturbations  has  been  restricted  to 
absolute  changes  in  the  least  squares  solution  due  to  errors  in  the 
observed  values.  The  next  step  consists  in  analyzing  relative 
changes  in  the  solution  due  to  perturbations  in  the  data  as  well  as  in 
the  matrix  H. 

Assume  for  the  sake  of  simplicity,  that  the  noise  is  white. 
This  implies  that  the  solution  to  the  normal  equations  is  given  by 

x = (HTHf1HTX  (3.3-12) 

As  pointed  out  before,  the  previous  expression  can  be  put  into  the 
form 

x = H+x  (3.3-13) 

where  H is  the  pseudoinverse  of  H. 

Call 


X = X + E 


(3.3-14) 


and  let  Xj  and  Xj  t>e  the  projections  of  x and  x respectively,  onto  the 
range  of  the  transformation  H, denoted  R(H).  Under  these  conditions, 
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the  following  bound  is  valid  for  the  relative  changes  in  the  solution 
to  the  least  squares  problems  T3-8,  page  221] 


H + £-H+xM  ^ c(H)  . 


(3.3  -15) 


hV 


where  c(H)=M.  Hh+I!  • t3'3'16* 

The  quantity  c(H)  is  called  the  condition  number  of  the  blur 

matrix  H.  It  plays  an  extremely  important  role  in  explaining  the 

effect  of  perturbations  on  the  accuracy  of  the  computations  involved. 

Equation  (3.  3-15)  can  be  obtained  by  the  following  reasoning. 

Decompose  x 1111:0  Xj  and_Z2  where  Xj  belongs  to  R(H)  and  x2  is  in  its 

T 

orthogonal  complement,  which  is  the  null  space  of  H , denoted  by 
N(HT).  Therefore, 

H+x  = H+X!  + = Elj  + (HTM)HTX2  = H+Xj  (3‘3'  17) 


Analogously, 


H+X  = H+Xi 


(3.3-18) 


where  X j is  the  projection  of  x onto  R(H)*  Hence 

HhVh^H  =llH+d1-i1)ll  * »H+».  "ij-Z!11  (3.3-19) 


On  the  other  hand,  it  is  easily  shown  r3-8,  page  220]  that  if 
H+y.  is  the  solution  to  the  least  squares  problem,  then 


and 


H H y_  = jjTj 


IhH^I!  £ II H II  . I!h+£^ 


(3.3-20) 

0 


(3.3-21) 


or 


(3.3-22) 


By  dividing  (3.3-19)  by  (3.3-22)  one  obtains  the  desired  result 

Observe  that  it  is  only  the  component  of  the  relative  error  in 
the  observed  vector  of  pixel  values  lying  on  the  range  of  the  blur 
matrix  that  contributes  to  the  relative  error  in  the  estimated  p xel 

value  s . 

The  condition  number  will  determine  the  effect  of  the  noise 
in  the  restoration  process.  If  its  value  is  small  a little  relative 
perturbation  on  the  observed  blurred  picture  will  not  produce  large 
relative  changes  in  the  restored  picture.  In  this  case,  the  normal 
equations  are  said  to  be  well  conditioned.  If,  on  the  other  hand,  the 
condition  number  has  a large  value,  small  relative  changes  in  the 
observed  values  may  greatly  affect  the  estimated  pixel  values  and 
the  normal  equations  are  said  to  be  ill  conditioned. 
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The  matrix  norm  used  in  the  expression  of  the  condition 
number  II H H • II  H+  II  can  be  any  one  of  the  matrix  norms  that  are 
consistent  with  the  vector  norm  used.  In  particular,  one  may  select 
the  spectral  norm,  which  is  equal  to  the  largest  singular  value  of 
the  matrix. 

In  order  to  find  the  largest  singular  value  of  the  pseudoinverse 
■f  -f 

H , (or  (C^H)  if  a colored  noise  corrupts  the  image)  the  factori- 
zation expressed  by  equation  (3.3-7)  will  be  used.  By  doing  this  and 
also  using  the  expression  for  the  pseudoinverse  one  obtains 


(CH)+  = [ (CH)T  c h]  " 1 (CH)T  = [ qti>  ptp  q 1 qV  pt 

(3.3-23) 

But  since 

PTP  = QQT  = l (3.3-24) 

and 

T 1 -1  T - 1 

(Q  L Q)  = Q L Q (3.3-25) 


it  follows  that 


+ T "1  T 
(CH)  =Q  L * P 


(3.3-26) 


The  matrix  (CH) 


+ 


is  an  NxM(M  ^ N)  matrix, 


so  its  singular 


values  are  calculated  by  the  positive  square  roots  of  the  matrix 
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+ + T T - 1 

(CH)  [(CH)  ] = Q L Q 


(3.  3-27) 


The  eigenvalues  of  this  matrix  in  factorized  form  are  given  by  the 

1 

. . . , 2 


-1  1 1 
diagonal  elements  of  L , namely,  “T”  , ^ 


As  a 


w 


1 


w„ 


w 


n 


consequence,  the  singular  values  of  (CH)  are  the  reciprocal  of  the 
singular  values  of  CH.  The  largest  singular  of  (CH)  is 


w 


n 


Therefore,  the  condition  number  is  given  by 


w 

ll(CH)!!||(CH)+||  (3.3-28) 

n 

that  means  that  this  number  is  the  ratio  of  the  largest  to  the  smal- 
lest singular  value  of  the  matrix  (CH). 

A further  insight  can  be  obtained  by  considering  the  re- 
lationship between  the  condition  number  and  the  K -ellipsoid  [ 3-2, 
page  54].  Equation  (3.2-39)  expresses  the  relationship  between  the 
length  of  an  axis  of  the  ellipsoid  and  the  corresponding  singular 
value.  Using  that  expression  one  may  immediately  conclude  that 
the  ratio  of  the  largest  to  the  smallest  singular  value  of  (CH)  is  also 
the  ratio  of  the  largest  to  the  smallest  axis  of  the  ellipsoid.  T1  is 
means  that  the  more  the  ellipsoid  departs  from  the  shape  of  a sphere, 
the  more  ill  conditioned  the  restoration  problem  will  be.  Figure 
(3.3-1)  obtained  from  reference  [3-2,  page  54]  shows  the 
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comparison  between  the  K -ellipsoids  for  the  well  and  ill  conditioned 
system,  for  the  case  where  N = 2. 

When  the  noise  is  Gaussian,  the  observed  blurred  and  noisy 
pixel  values  are  also  normally  distributed,  under  the  linear  model. 
In  this  case  one  can  define  ellipsoids  centered  at  the  mean  value  of 
the  observed  pixel  values  (it  may  be  assumed  to  be  the  origin  for 
convenience)  and  containing  a given  percentage  of  this  multivariate 
distribution. 

Since  the  estimated  pixel  values  are  obtained  by  combining 
linearly  the  observed  values,  it  follows  that  the  estimators  are  also 
Gaussian  distributed. 

It  is  possible  to  show  [ reference  3-2,  pages  55-58]  that  if 

the  ellipsoid  for  the  observed  values  in  a regression  model  has  the 
exp*  e s si  on 


V"1  (rHx)  * p2  (3.3-29) 

then  the  corresponding  ellipsoids  of  the  estimators  are  given  by 

(*-S)THT  y-!H(x-i)  s p2  (3.3.30) 

This  ellipsoid  essentially  gives  the  multidimensional  confidence 
interval  for  the  pixel  values  under  the  normality  assumption.  The 
eigenvectors  and  eigenvalues  <HT  V"1  H)  will  determine  the  size 
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and  shape  o£  the  ellipsoid. 

The  attention  will  be  devoted  now  to  the  problem  of  con- 
sidering the  effects  of  perturbations  of  the  blur  matrix  H on  the 
restoration  problem.  This  question  is  of  extreme  importance  since 
the  experimenter  rarely  knows  with  great  precision  the  spread  func- 
tion. This  is  particularly  true  when  that  function  is  derived  from 
measurements  that  inevitably  involve  errors. 

The  analysis  of  the  effect  of  the  perturbation  on  the  blur 
matrix  is  quite  involved.  In  order  to  do  this  a new  terminology  is 

introduced  by  Stewart  T 3-91 . Let  Ebe  a perturbation  matrix  on  the 

M 

blur  matrix  H and  S be  a sub  space  of  R . Each  column  of  !E  is  an 

M -vector  that  can  be  projected  onto  S.  Call  and  — the 

projections  of  E(y)  onto  the  range  of  the  blur  matrix,  denoted  by  R(H) 

T 

and  its  orthogonal  complement  (N(H  )),  respectively. 

Assuming  the  over  deter  mined  model  and  if 


(3.3-31) 


then  the  columns  of  (A  + E)  are  linearly  independent.  Also,  assuming 
that 

x = H+x  (3.3-32) 


and 
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(3. 3-33) 


x = (H  + E)+  x 


then 


Hx-xH 

II 2 II 


2c  (H)  • 


+ 4c2(H)  . 


II  —2  II  11*^2  II 
II— II  * lUill 


+ 8c  3 (H) 


(3.3-34/ 


where  c(H)  = IIh  II  . Hh/  ! 


(3.3-35) 


is  the  condition  number  of  H and  the  consistent  Euclidean  norm  for 
vector  and  Frobenius  norm  for  matrices  is  used.  The  third  term 
in  the  bound  depends  on  the  square  of  11  E II  2 and  usually  can  be 
disregarded  when  compared  to  the  first  two  terms.  The  first  term 
is  similar  to  the  bound  that  can  be  derived  for  perturbations  in  non- 
singular linear  systems. 

The  second  term  states  that  the  relative  perturbation  in 
T 11*211 

N(H  ) is  amplified  by  c(H)  • jj tt  . Since  ^ is  the  projection  of 


£ onto  R(H)  and  ^ is  the  projection  on  its  orthogonal  complement, 


namely  N(H  ),  it  follows  that  the  ratio 


measures  how  nearly 


X lies  with  respect  to  R(H).  If  y_  is  close  to  R(H/  this  ratio  will  be 
small.  If  j]  ||  and  ||E_  j|  are  of  the  same  order  of  magnitude  then 
the  first  term  tends  to  dominate  when  ^ is  small.  If,  on  the  other 
hand,  is  large,  the  second  term  is  prevalent.  Stewart  states 
loosely  that  "if  £ very  nearly  lies  in  R(H),  then  c(H)  is  the  condition 
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number  for  the  least  squares  problem,  otherwise  c (H)  is  the  condi- 
tion number.  11 

An  important  conclusion  in  the  case  of  image  restoration  can 

be  drawn.  If  there  is  a small  amount  of  noise  present  in  the  observed 

pixel  values  (high  signal  to  noise  ratio),  the  will  tend  to  be  near 

R(H),  which  implies  that  c(H)  will  be  the  condition  number.  If,  on  the 

contrary,  the  signal  * o noise  ratio  is  low,  the  component  of  the  noise 

2 

will  tend  to  place  farther  away  from  R(H)  and  in  this  case  c (H)  will 

be  the  condition  number.  Since  c(H)  is  always  greater  or  equal  to 

one,  the  latter  case  is  certainly  a worse  situation.  Incidentally,  it 

2 

should  be  remarked  that  c (H)  is  the  condition  number  of  the  matrix 
(HTH). 


3.4  The  Underdetermined  Model 

So  far  the  study  of  the  image  restoration  problem  has  been 
essentially  restricted  to  the  over  deter  mined  model.  This  means  that 
the  (M  x N)  blur  matrix  H is  assumed  to  have  rank  N.  In  other  words, 
the  columns  of  this  matrix  are  supposed  to  be  linearly  independent. 

On  the  other  hand,  if,  in  the  discretization  method  of  the  con- 
tinuous planar  equation  that  describes  the  blurring  process,  a number 
of  nodes  for  the  quadrature  formula  is  selected  exceeding  the  number 
of  observed  va'ues  (i.e.,  M <N),  this  condition  is  violated  and  the 
rank  of  the  blur  matrix  K is  necessarily  less  than  N. 
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Under  the  overdetermined  model  there  was  a unique  solution 
for  the  restoration  problem  given  by  the  set  of  normal  equations. 
Furthermore,  estimators  and  finite  confidence  intervals  were  ob- 
tained for  every  parametric  function  of  the  pixel  values.  Also,  a 
finite  condition  number  was  obtained  by  considering  the  ratio  of  the 
largest  and  the  smallest  singular  values  of  the  matrix  CH. 

If  H has  not  full  column  rank  several  important  consequences 
are  immediately  derived.  First,  the  uniqueness  of  the  set  of  normal 
equations  cannot  be  guaranteed  any  more,  since  the  matrix  (H  ^ V- *H) 
is  singular  and  therefore  cannot  be  inverted.  Second,  the  smallest 
singular  value  of  the  matrix  CH  is  zero,  resulting  in  a condition  num- 
ber with  an  infinite  value.  As  a consequence  of  this  fact,  many  linear 
combinations  of  pixel  values  have  an  infinite  confidence  interval, 
which  is  equivalent  to  ray  that  these  functions  are  not  estimable 

There  is  a concept  that  not  only  is  necessary  for  the  study  of 
underdeter min?d  systems  but  also  broadens  the  view  over  the  over- 
determined systems,  unifying  the  whole  study  of  the  linear  model  in 
regression  analysis.  It  is  the  concept  of  the  generalized  inverse  ox  a 
matrix,  which  was  mentioned  briefly  in  connection  with  the  treatment 
of  the  overdetermined  model  and  now  is  more  fvlly  treated. 

Initially,  a brief  survev  of  generalized  inverse  concepts  will 
be  presented.  There  are  several  ways  of  presenting  these  concepts. 
The  presentation  contained  in  reference  [3  -3]  will  be  followed. 
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Given  an  M x N matrix  H,  the  matrix  H+  obtained  by  the 
following  limiting  operations 

+ T 2 - 1 T 

H = lim(H  H+a  I)  H 

a-»  0 

T T 2-1 
= limH  (HH  +a  I) 

a-»0  (3.4-1) 

always  exists.  Also,  for  any  (M  x 1)  vector  the  vector 

x = H+_y  (3.4-2) 

is  the  vector  of  minimum  norm  among  those  that  minimize 

(3.4-3) 

A T 

It  can  be  shown  that  x is  the  unique  vector  in  R(H  ) satisfying  the 
equation 

Hx=i  (3.4-4) 

where  £ is  the  projection  of  ^ on  R(H).  This  vector  j!c  satisfies  the 
set  of  normal  equations 

HTHx  = HTx  (3.4-5) 

The  unique  matrix  H is  celled  the  generalized  inverse  or  the  pseu- 
doinverse of  the  matrix  H. 

As  a corollary  of  expressions  (3.4-1)  and  (3.4-2),  it  follows 
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that  for  any  vector  £,  HH+x  is  the  projection  of  jf  on  R(H),  (I-HH  )y_ 

is  the  projection  of  x on  N(HT).  Furthermore,  for  any  vector  x, 

H + Hx  is  the  projection  of  x on  K(HT ) and  (I-H  H)x  is  the  projection 

of  x on  N(H).  It  should  be  observed  at  this  point  that  a projection 

2 

matrix  P is  idempotent,  i.  e. , P 

If  H is  square  and  nonsingular,  il+  is  the  inverse  of_H  , H . 
If  the  columns  of  H are  linearly  independent,  like  in  the  overdeter- 


mined model,  the  pseudoinverse  is  given  by 


T -IT 

= (H  H)  H 


(3.4-6) 


If,  on  the  other  hand,  the  rows  of  H are  linearly  independent,  the 
pseudoinverse  will  be  represented  as 


T T -1 

= H (MH  ) 


(3.4-7) 


A better  perspective  over  the  pseudoinverse  can  be 
considering  some  specific  cases.  Take,  for  example,  the 
matrix  H,  represented  tv  the  value  h.  In  this  case, 

H + = 0 if  h = 0 

= z ifh  *0 

h 


obtained  by 
(lx  1) 

(3.4-8) 


If  H is  diagonal, 


H = diag  (hj,  h ^ ...  h^) 


(3.4-9) 


then 


dlag  (h,+.  h2+,  •••hM+) 


(3.4-10) 


where 


h£  = 0 ifhj  = 0 


T — if  h.  i 0 

hi  1 


(3.4-11) 


If  H is  a symmetric  (M  x M)  matrix,  it  is  possible  to  repre- 
sent it  in  the  following  form 


H = T D T 


(3.4-12) 


where  T is  an  orthogonal  matrix  and  D is  diagonal.  Using  (3.4-1), 
H*  can  be  expressed  as 


= limT(D2  + q2I)"1DTT 
a -»  0 

2 2-1  T 

= T lim(D  + a I)  D T 

a -»  0 

+ T 
= TD  T 


(3.4-13) 


As  a result,  the  pseudoinverse  of  a symmetric  matrix  can  be  obtained 


by  pseudoinver ting  the  diagonal  matrix  that  consists  of  its  eigenvalues 

. _+  -1 

If  H is  nonsingular,  all  the  eigenvalues  are  nonzero  and  D-D.  .so 
that  H+  = H . 

This  result  on  symmetric  matrices  leads  to  spectral  repre- 
sentations for  the  pseudoinverse  matrix.  If  the  columns  of  T are  de- 
noted by  t.  , t_  , 1M  and  the  eigenvalues  of  H by  > Xj,  • • • , 
the  matrix  H can  be  represented  as 


M 

H = S 

i = l 


T 

X.  t,t. 

i “i-J 


(3.4-15) 


and  the  pseudoinverse  H by 


M 


= I)  * 

i=i 


\tT 
i -i-J 


(3.4-15) 


where  \ + has  the  same  meaning  as  in  (3.4-11). 

Two  results  that  will  be  useful  in  the  analysis  of  the  under  de- 
termined model  of  the  restoration  process  are  now  stated.  For  any 
matrix  H,  x belongs  to  the  null  space  of  H if  and  only  if 

x = (I_  - H +H )y;  (3.4-16) 

for  some  vector  For  any  matrix  H,  z belongs  to  the  range  of  H if 


and  only  if 
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z = HH+u_  (3.4-17) 

for  some  vector  u. 

For  any  rectangular  matrix  H,  the  pseudoinverse  can  be  ex- 
pressed in  terms  of  pseudoinverses  of  symmetric  square  matrices  as 
follows 

+ T + T T T 4- 

H = (H  H)  H = H (HH  ) (3.4-18) 

It  can  be  shown  that  an  entirely  equivalent  way  of  introducing 
the  pseudoinverse  exists.  This  is  the  so  called  Penrose  characteri- 
zation [3-3,  page  28].  A matrix  H+  is  said  to  be  the  pseudoinverse 
of  a matrix  H if  and  only  if  the  four  conditions  are  satisfied 

HH  and  H H are  symmetric 
HH+H  = H 

and  H + HH+  = H+ 

These  results  can  now  be  applied  to  the  restoration  problem. 
Consider  first  the  no  noise  case 


JL  =H*  (3.4-20) 

where  x represents  the  vector  of  pixel  values,  H is  the  blur  matrix 
and  y is  the  vector  of  observed  values.  No 


restriction  is  placed  on  the 
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dimension  of  the  blur  matrix,  either  the  overdetermined  or  the  un- 
derdetermined models  could  be  involved. 

Consider  first  the  problem  of  existence  of  solution.  In  order 
for  a solution  to  exist  x must  be  on  R(H).  By  (3.4-17)  this  occurs  if 
and  only  if  y.  = HH+u  for  some  u_.  On  the  other  hand,  since  HH*  is  a 
projection  on  R(H),  it  follows  that 

HH+y.  = (HH+)2u  = HH+u  = £ (3.4-21) 

The  condition  expressed  by  the  previous  equation  is  the  so  called  con- 
sistency condition  for  the  solution  of  a linear  system. 

At  this  point  it  is  perhaps  useful  to  point  out  that,  under  no 
noise,  for  real  situations,  ^ will  always  be  in  R(H)  since  it  was  ob- 
tained by  blurring  ar.  existing  picture.  This  is  why  the  restoration 
problem  is  then  formulated  as  searching  for  the  solution  of  the  linear 
system  (3.4-20)  instead  of  directly  solving  for  the  least  squares 
problem. 

Turning  now  to  the  problem  of  uniqueness  of  the  solution,  the 
homogeneous  system  Hx  = (1  has  to  be  investigated.  Observe  that, 
for  any  vector  v 

x = (I_  - H+H)v  (3.4-22) 


is  a solution  to  the  homogeneous  system  since 
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Hx  = H(I-H+H)v  = (H-HH+H)v  = 0 (3.4-23) 

where  in  the  last  equality  one  of  the  conditions  expressed  by  (3.4-19) 
was  used.  Therefore,  if  H H t Ij  a nonzero  vector  can  always  be 
added  to  the  solution,  without  changing  the  left  hand  side  of  system 
(3.4-20).  A necessary  and  sufficient  condition  can  be  expressed  by 

H+H  = I (3.4-24) 

this  condition  being  equivalent  to  the  statement  that  N(H)  consists  only 
of  the  zero  vector. 

A general  solution  to  thy  linear  system  (3.4-20)  can  be  ex- 
pressed as 

x = H+x  + (I-H+H)v  (3.4-25) 

where  v is  an  arbitrary  vector. 

In  the  case  of  the  over  determined  system,  the  blur  matrix  H 
has  linearly  independent  columns,  H+  is  expressed  by  (3.4-6)  and  the 
condition  (3.4-24)  is  satisfied  so  that  the  unique  solution  is  given  by 

x = H+x  (3.4-26) 

For  the  underdetermined  model,  condition  (3.4-24)  is  not  satisfied 
and  there  will  not  be  a unique  solution  to  the  system  of  linear 


w 
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equations. 

The  vector  H jis  the  minimum  norm  solution.  This  can  be 
verified  by  noticing  that  the  set  of  vectors  given  by  (I_-H*H)v  is  ortho- 
gonal to  H ^ as  shown  below 

L(i-H+H)v]T  . = vT(I-H+H)Th\ 

— X (3.4-27) 

= vT(I-H+H)H+x  = vT(H+-H+HH+)X  = 0 

where  the  second  equality  used  the  fact  that  (I_-H+H)  is  a symmetric 
matrix  and  the  fourth  equality  was  based  on  one  of  the  relations  in 
(3.4-19).  Figure  (3.  4-1)  taken  from  reference  [3-2,  page  63],  shows 
the  geometry  of  the  solutions  to  the  linear  system  in  the  underdeter- 
mined case,  when  N = 2 and  the  dimension  of  N(H)  is  1. 

Now,  suppose  that  noise  is  added  to  the  system  so  that 

£ = Hx  + n (3.4-28) 

In  thi3  case  one  would  search  for  an  estimator  x of  x under  some 
meaningful  rtatistical  criterion.  In  the  overdetermined  case  the  best 
linear  unbiased  estimator  (B.  L.U.E.)  has  already  been  obtained  and 
it  was  shown  to  be  unique.  Suppose,  therefore,  that  one  is  looking 
for  a B.  L.U.E.  estimator  in  the  underdetermined  model,  where  rank 
(H)  < N.  Assume  that  a linear  estimator  of  the  form 
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x 


(3.4-29) 


is  used,  where  U is  an  (M  x N)  matrix. 

By  imposing  the  unbiasedness  condition,  the  following  ex- 
pression has  to  be  valid,  for  any  value  of  x 


U 


x 


(3.4-30) 


Thus,  it  follows  that 


T 

a a -iN 


(3.4-31) 


But  if  this  is  true,  the  rank  of_I^.,  N would  be  larger  than  the 
rank  of  one  of  its  factors  (H,  with  rank  < N)  which  contradicts  the 
Sylvester  Inequality  for  the  product  of  matrices.  Therefore,  there  is 
no  unbiased  linear  estimator  for  the  vector  of  pixel  values  x. 

This  fact  greatly  limits  the  usefulness  of  the  underdetermined 
model.  This  can  be  viewed  from  the  perspective  of  being  the  price 
paid  for  increasing  the  number  of  quadrature  nodes  above  the  number 
of  observed  values.  Because  of  lack  of  information  an  unbiased  esti- 
mator for  the  pixel  values  cannot  be  obtained. 

However,  restoration  can  be  attempted  according  to  another 
criterion,  namely,  the  minimization  of  the  least  squares  quadratic 


form 


w 
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-1, 


0(x)  = (i-Hx)V  (x-Hx) 


(3.4-32) 


where  V is  the  covariance  matrix  of  the  noise  that  corrupts  the 
pic  tur  e . 

The  set  of  normal  equations  is  represented  by 


T - 1 T - 1 

(H  V H)x  = H V 


(3.4-33) 


By  performing  the  factorization 


- 1 T 
V = C C 


(3.4-34) 


(3.4-33)  can  be  expressed  as 


T T T T 

(H  C CH)x  = H C Cy_ 


(3.4-35) 


1 


1 


in  order  to  check  whether  the  system  is  solvable,  the  consistency 

condition  of  equation  (3.4-21)  would  have  to  be  checked.  Instead  of 

doing  this,  a simpler  way  would  be  to  observe  tl\at  R((CH)  ) = 

R((C^H)  CH).  Since  H ~ the  range  of  = (iCH)^,  it 

T T T 

must  be  in  the  range  ofH  C = (£12)  C^H.  As  a result,  it  must 

A T 

be  the  image  of  some  x under  the  transformation  (C H)  C^H.  In  other 
words,  the  set  of  normal  equations  is  always  consistent. 

Using  (3.4-25)  the  general  solution  of  this  system,  of  linear 
equations  is  given  by 
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a TT  + TT  _ TT  ■lt'T’ 
x = (H  C CH)  H C C*  + [l_-  (H CTCH)+(HTCTCH)]v  (3.4-36) 

for  an  arbitrary  vector  v. 

Taking  into  consideration  (3.4-18),  the  previous  expression 
can  be  reduced  to 

x = (CH)+C*  + [l_-  (CH)+  CH]v  (3.4-37) 

The  solution  of  the  least  squares  problem  is,  therefore,  not 
unique  and  any  vector  in  the  set  expressed  by  the  previous  equation 
minimizes  the  quadratic  form.  The  vector  (CH)+C£is  now  merely 
the  smallest  norm  solution  that  gives  this  minimum  value. 

In  Figure  (3.4-2),  taken  from  reference  [3-2,  page  65],  the 
geometry  of  the  least  squares  problem  for  the  underdetermined 
model  is  shown.  The  surface  of  the  quadratic  form  (3.4-32)  is  in- 
finitely long  in  the  directions  of  the  eigenvectors  of  HTV~1H  = 

T T 

— C CH  corresponding  to  the  zero  singular  valuej  of  (CH).  The  set 
of  solutions  given  by  (3.  -4 -37)  is  the  projection  on  the  x space  of  the 
bottom  of  the  infinitely  long  quadratic  through.  The  K -ellipsoids  are 
degenerate,  being  infinitely  long  in  the  directions  of  the  mentioned 
eigenvectors.  One  of  the  principal  axes  of  these  ellipsoids  is  given 
by  the  solution  set  of  the  least  squares  problem.  The  number  of 
dimensions  whore  the  ellipsoid  is  infinite  is  the  dimension  of  N(H), 
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assuming  that  is  positive  definite. 

Suppose  now  that  the  experimenter  is  interested  in  estimating 
parametric  functions  of  the  pixel  values,  represented  by  the  inner 
product 

T 

f = £ x (3.4-38) 

A linear  combination  of  the  observed  pixel  value r is  used  to  perform 
this  estimation 

a ip 

$ = u £ (3.4-39) 

The  requirement  of  unbiasedness  implies  that 

E(uT;y)  = cTx  (3.4-40) 

which  in  turn  leads  to 

uTHx  = £Tx  (3.4-41) 

for  any  value  of  sc.  Therefore,  the  following  equality  must  be  valid 

T T 

u K = c (3.4-42) 

At  this  point  the  analogy  between  (3.4-31)  and  (3.4-42)  is  clear.  The 
previous  equation  can  be  also  expressed  by  stating  that  there  must 


exist  a vector  u such  that 
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u 


c 


(3.4-43) 


In  other  words,  the  vector  £ has  to  be  a linear  combination  of  the 
T 

columns  of  H , a condition  that  can  also  be  expressed  by  saying 

T 

that  £ belongs  to  R(H  ) or  still  that 


H He  = c 


(3.4-44) 


+ T T 

since  H J4  is  a projection  matrix  onto  R(H  ).  The  term  R(H  ) has 

dimension,  say  K < N,  w h will  determine  the  number  of  linearly 

independent  parametric  functions  of  pixel  values  that  e estimable. 
T 

The  functions  c^  x, given  by 


c , — (0,  • • • • , 0)  i - 1,2, 


N 


(3.4-45) 


form  a set  of  N linearly  independent  parametric  functions  that  can- 
not all  be  estimated  by  an  unbiased  estimator.  This  confirms  the 
result  that  the  whole  vector  of  pixel  values  is  inestimable. 

If  $ is  an  estimable  parametric  function,  it  can  be  proved 

A 

that  the  estimator  $ is  given  by 


S = £T(CH)+Cy.  = £Tx 


(3.4-46) 


where 
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x = (CH)+Cy  (3.4-47) 

is  the  minimum  norm  solution  to  the  least  squares  problem.  The 
variance  of  this  estimator  is  given  by 


2 

a 


($) 


T 

u Vu 


(3.4-48) 


where 


T 

u 


T A 

= £ (CH)  £_ 


(3.4-4?) 


As  far  as  confidence  intervals  and  hypothesis  testing  for  para- 
metric functions  of  pixel  values,  die  analysis  can  be  carried  out  in  a 
manner  analogous  to  the  overdetermined  model.  The  confidence  in- 
terval will  be  finite  for  estimable  functions  and  infinite  for  inestimable 
functions.  Similarly,  hypothesis  involving  estimable  functions  will 
be  testable,  while  those  involving  nonestimable  functions  will  be 
nonte stable.  (See  Appendix  A.) 


4.  CONSTRAINED  RESTORATION 


In  the  previous  chapter  the  restoration  problem  was  solved 
by  application  of  regression  techniques.  However,  the  experimental 
results  will  show  that  the  problem  can  often  be  ill  conditioned.  This 
fact  can  be  expressed  by  the  large  variances  of  the  estimators  of  the 
individual  pixel  values  or  linear  combinations  of  them.  On  the  other 
hand,  these  techniques  make  use  of  the  minimum  possible  amour t of 
a priori  information  about  the  image  to  be  restored.  Pixel  values  are 
simply  regarded  as  parameters  to  be  determined  In  a N2  dimensional 
space.  Through  the  use  of  some  additional  a priori  information  it  is 
possible  to  reduce  considerably  the  uncertainty  about  the  estimators. 
This  can  be  done  in  several  forms. 

4.  1 Analysis  of  Established  Techniques 

The  classical  Bayesian  approach  consists  of  assuming  an  a 
priori  joint  probability  density  on  the  pixel  values  to  be  estimated. 

The  problem  of  estimating  these  values  under  a meaningful  criterion 
like  the  mean  square  error,  for  any  kind  of  a priori  densities  can  be 
very  complex,  involving  nonlinear  filters.  If  only  linear  operations 
are  allowed  or  if  only  second  moments  are  known  or  if  both  noise  and 
pixel  vectors  are  gaussian  distributed  and  any  operation  is  allowed, 
the  optimum  procedure  is  the  well  known  Wiener  filter. 
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The  problem  can  be  formulated  by  the  linear  model 

X = Hx  + n (4.  1-1) 

2 2 2 

where  H is  an  (M  x N ) matrix,  is  the  (M  x 1)  vector  of  observed 

2 

pixel  values,  x is  the  (N  x 1)  vector  of  pixels  to  be  estimated  and  n 
2 

is  the  (M  x 1)  noise  vector.  The  covariance  matrices  of  both  n and  x 
are  known  to  be  respectively,  V and  C , assumed  to  be  positive 
definite.  Zero  means  and  unc  or  relatedness  of  signal  and  noise  are 
assumed  for  simplicity.  It  is  desired  to  estimate  the  vector  of  ran- 
dom variables  x by  means  of  a linear  operation 

x = Gy_  (4.  1-2) 

T 

in  such  a way  that  the  covariance  matrix  E f (x-x)  (x-x)]  is  minimized 
in  the  sense  described  in  section  3.  1 of  chapter  3.  It  is  well  known 
that  G can  be  obtained  by  imposing  the  orthogonality  principle,  that 
leads  to  the  result 

T T -1  -1  T-1-1T-1 

G = C H (HC  H +V)  = (C  +H  V H)  H C (4.1-3) 

— — XX  XX-  — “XX  — — — — — 

and  the  covariance  matrix  of  the  estimator  is  given  by 

C HT(HC  HT+V)"1HC  (4.1-4) 

— XX“ XX”  XX 


A connection  between  the  Wiener  filtering  technique  and  the 
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regression  techniques  can  be  drawn.  Assume  that  both  signal  and 
noise  are  uncorrelated,  that  is 


V = (T  1 


(4. 1-5) 


and 


C = 6£,I 
~xx  — 


(4.  1-6) 


;umes 


Now  suppose  that  the  value  of  62  fs  fixed  at  unity  and  that  02  ass. 
progressively  lower  values  approaching  zero.  Under  these  conditions, 
the  first  expression  for  G assumes  the  form 


G = lim  H * (HH 1 + a 2 I)  " 1 = H 


fT-*0 


(4.  1-7) 


where  equation  (3.4-1)  has  been  used.  The  same  result  could  be  ob- 
lalned  by  fixing  the  noise  level  r,1  - 1 and  letting  the  variance  of  the 
signal,  62,  g„  to  Infinity.  In  this  case  the  second  expression  for  G 


would  be 


(4. 1-8) 


In  either  case,  the  pseudoinverse  is  obtained  when  the  variance  of  the 
a priori  distribution  on  the  pixel  values  is  much  greater  than  the  noise 
variance.  This  corresponds  to 


assuming  essentially  no  a priori 
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knowledge  about  the  parameters  to  be  estimated,  which  is  exactly 
what  the  regression  techniques  do. 

It  is  interesting  at  this  point  to  relate  expressions  (4.  1-7)  or 
(4.  1-8)  to  the  method  of  ridge  regression  [2-21,  2-22,  and  2-23]  or 
Twomey's  method  [2-19  and  2-20]  for  the  case  where  the  matrix  V 
is  the  identity.  The  fact  that  there  is  some  probabilistic  prior  infor- 
mation about  the  vector  of  pixel  values  plays  the  same  role  in  the 
computational  procedure  as  the  damping  factor  V in  (2.4-7),  for 
example. 

Instead  of  a probabilistic  a priori  information  one  could  also 
incorporate  deterministic  a priori  constraints.  These  could  be  de- 
rived, for  example,  from  the  knowledge  of  some  physical  restrictions 
that  the  solution  must  satisfy.  The  methods  described  in  chapter  2, 
Phillips',  Twomey's  and  Tichonov's,  can  be  regarded  as  equality 
constrained  methods  where  the  constraint  expresses  some  degree  of 
smoothness  that  the  solution  of  the  least  squares  problem  must 
possess. 

In  the  following  discussion  a framework  to  understand  these 
equality  constrained  methods  will  be  formulated.  This  will  not  only 
help  to  get  a better  understanding  of  these  methods  but  also  will  make 
the  connection  between  them  and  the  linear  equality  and  inequality 


methods  to  be  described  later. 
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Phillips',  Twomey's  or  Tichonov's  method  can  be  described 

T 

as  searching  for  the  minimum  of  a quadratic  form  x Cx  that  ex- 
presses the  smoothness  requirement,  subject  to  an  equality  con- 
straint on  the  residual  vector 


til  -Hx)T(y_  -Hx)  = e (4.1-9) 

By  imposing  this  restriction,  the  stationary  point  of  the  Lagrangean 
expression 


F(x , >)  = xTCx  + \[(£-Hx)T(i!-Hx)-e]  (4.1-10) 

is  searched  for.  Taking  the  derivatives  with  respect  to  x and  X and 
setting  them  to  zero  one  obtains 

9F  T T 

— = 2Cx  + x[-ZH  y + 2H  Hx]  = 0 (4.1-11) 

qX  ““ 

9F  T 

= (y  - Hx)  (y-Hx)  -e  = 0 (4.1-12) 

From  (4.  1-11)  it  follows  that 


(C  + XHTH)x  = XHTy 


(4.1-13) 
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or 


, dividing  by  X,  and  solving  for  x 


* = (hTH  + \ c)  H1 


(4.  1-14) 


where  X Is  chosen  such  that  (4.  1-9)  Is  satisfied. 

Now  consider  another  problem  related  to  the  previous  one.  It 
consists  of  solving  the  least  squares  problem,  tot  is.  minimizing  the 
norm  of  the  residual  (jr-Hx).  but  with  the  additional  restriction  of 
satisfying  an  equality  constraint  expressed  by 


x Cx  = d 


(4.1-15) 


In  this  case  the  Lagrangean  expression  is 


G(x,  M = (i-Hx)T(i-H£)  + v[xTCx-dl  (4.1-16) 


Setting  the  derivatives  equal  to  zero,  it  follows  that 


32.  = 2HTy  + 2HTHx  + 2yCx  = 0 
2x  — 


(4.1-17) 


22-  = xTCx  - d = 0 

9y 


(4. 1-18) 


Solving  for  x from  the  first  equation  one  obtains 
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x = (HTH  + VCfV*  (4.1-19) 

A comparison  of  (4.  1-14)  and  (4.1-19)  reveals  that  the  two  somehow 
inverse  problems  are  solved  by  the  same  expression,  the  only  dif- 
ference being  the  Lagrange  multipliers  which  are  inverse  of  one 
another.  This  can  be  viewed  from  a geometrical  point  of  view,  ex- 
pressed by  Figure  (4.  1-1).  In  two-dimensional  space,  the  contours 

T T 

of  constant  value  of  both  quadratic  forms,  (i-Hx)  (y.-Hx)  and  x Cx 

T 

are  represented.  The  same  solution  is  obtained  if  x Cx_is  minimized 

subject  to  (£-Hx)T(x-Hx)  = e or  if  (£-Hx)T(y-Hx)  is  minimized  sub- 
T 

ject  to  x Cx  = d. 

Phillips',  Twomey's  and  Tichonov's  methods  can  therefore  be 
regarded  as  iterative  methods  to  solve  the  equality  constrained  quad- 
ratic minimization  problem,  with  the  equality  being  also  expressed  by 
a quadratic  expression.  For  V = 0,  the  solution  is  equivalent  to  un- 
constrained estimation;  it  will  exhibit  no  bias  and  the  covariance 
matrix  of  the  estimator  is  (H  H)"  . For  a value  of  Y ^ 0 this  will 
correspond  to  imposing  a constraint  expressed  by  some  quadratic 
form.  The  variance  is  reduced  because  the  solution  is  now  restricted 
to  the  contour,  but  bias  is  introduced.  When  Y tends  to  infinity,  the 

estimator  will  be  given  by  the  origin  of  the  x-space.  This  corres- 

T 

ponds  to  imposing  the  quadratic  constraint  x 0x_»  var^ance  WH1 
be  reduced  to  zero  and  the  bias  will  be  finite  and  given  by  the 
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Figure  (4.  1-1)  Geometry  of  the  Smoothing 
and  Regularization  Methods 


difference  between  the  true  solution  x and  the  zero  vector. 

A problem  that  occurs  with  the  smoothing  and  regularizing 
techniques  is  that,  even  though  the  variance  of  the  solution  can  be 
calculated,  the  bias  is  unknown.  It  is  true,  however,  that,  in  prac- 
tice, it  has  been  observed  that  there  is  considerable  reduction  in 
variance  for  a small  amount  of  bias.  A possible  measure  of  the 
quality  of  the  estimator  would  be  the  mean  square  error,  computed  by 
the  square  of  the  norm  of  the  bias  plus  the  sum  of  the  variances  of 
the  individual  components  of  the  estimator.  The  fact  that  the  con- 
straint is  quadratic  makes  it  difficult  to  develop  any  testing  procedure 
to  verify  whether  the  mean  square  error  is  reduced  or  not  by  the 
imposition  of  the  constraint. 

4.2  Linear  Equality  Constraints 

Another  possible  type  of  equality  constraint  that  can  be  im- 
posed over  the  solution  of  the  restoration  problem  is  the  linear  equali- 
ty constraint.  This  could  be  derived  from  some  a priori  knowledge 
that  the  analyst  has  about  relations  involving  linear  combination  of 
pixel  values.  Examples  could  be  the  specification  of  individual  pixel 
values,  of  ratios  of  the  values  of  some  pixels,  or  the  sum  of  part  or 
all  of  the  pixels,  representing  the  integral  in  the  discrete  form  of  the 
image  as  measured  by  a photocell.  Another  alternative  would  be 
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presented  when  more  than  one  image  of  the  same  object  is  available. 
In  this  case,  if  the  blur  functions  are  supposed  to  be  known,  the 
specification  of  blurred  pixel  values  on  additional  pictures  would 
represent  linear  constraints  to  be  met.  The  value  of  these  con- 
straints will  depend  on  the  amount  of  uncertainty  represented  by  noise 
in  the  additional  images. 

Suppose  that  the  usual  overdetermined  linear  model  for  res- 
toration is  adopted.  The  covariance  matrix  of  the  noise  is  V, 
assumed  to  be  positive  definite.  The  set  of  linear  constraints 

Ax  = 1 (4.2-1) 

2 2 

is  imposed,  where  A is  a (P  x N ) matrix  of  rank  P < N , xis  the 

constrained  estimator,  and  t_  is  a (P  x 1)  known  vector. 

T - 1 

The  minimization  of  (y  -Hx)  V (X"Hx)  can  be  carried  out 
using  standard  Lagrangean  techniques.  The  optimal  estimator  will 
be  given  by  [3-1,  page  100} 

x = x + (HV  lH)  " 1 AT  [ A(HV  ^ " 1AT  ] _ 1 (1-Ax)  (4.2-2) 

where  ft  is  the  unconstrained  solution,  expressed  by 
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and  the  covariance  matrix  of  the  optimal  solution  is 

Vx  = 1h)-1  - (h\‘1h)"1aT[a(htv"1h)at]"1a(htv‘1h)"1 

(4.2-4) 

T -1  -1 

-x  “ ® Y H)  is  the  (positive  definite)  covariance  matrix  of  the 
unrestricted  estimator . The  second  matrix  cn  the  expression  of 
V~  can  be  shown  to  h«*  nor  negative  definite  with  rank  P.  Therefore, 
V~  is  equal  to  minus  a nonnegative  definite  matrix.  As  a con- 
sequence, each  diagonal  jlement  of  is  less  than  or  equal  to  the 
corresponding  element  of  V,.  Thus,  there  is  a reduction  in  the 
variance  of  each  component  of  the  constrained  estimator  vector  as 
compared  to  the  unconstrained  one.  However,  this  should  not  imply 
that  the  former  is  necessarily  better  than  the  latter.  In  fact,  the 
constrained  estimator  may  present  bias,  as  opposed  to  the  unbiased- 
ness of  the  unconstrained  estimator.  The  bias  of  the  constrained 
solution  is  given  by 

x - (HTY_1H)"1AT[ArHTv"1H)"1AT][i- Ax]  (4.2-5) 

This  bias  will  be  zero  if  and  only  if  the  specifications  are  satisfied 
by  the  true  vector  x,  i.  e. , if  Ax  = t.  In  this  case,  the  set  of  con- 
straints could  be  regarded  as  additional  noise  free  observations.  The 
following  result  obtained  by  Theil  [4-1,  pages  536-538]  follows 


naturally.  It  states  that,  if  the  restrictions  given  by  (4.  2-1)  are 
satisfied  by  the  true  parameter  vector  x,  then  the  equality  constrained 
estimator  x is  B.L.U.E.  of  x in  the  sense  of  giving  the  minimum 
variance  among  the  class  of  all  unbiased  estimators  that  are  linear 

in  and_t. 

The  reduction  of  variance  comes  from  the  fact  that  the  solu- 
tion x should  lie  in  a smaller  dimensional  space.  For  example,  as  an 
extreme  case,  if  the  linear  system  (4.2-1)  has  a unique  solution,  the 
variance  of  the  solution  will  be  zero.  However,  there  will  be  bias  if 

die  true  picture  is  not  this  solution  vector. 

Like  in  the  quadratic  equality  constrained  methods,  the  amount 

of  bias  is  unknown,  because  the  true  vector  of  pixel  values  is  not 
accessible.  A measure  of  the  quality  of  the  constrained  estimator 
should  take  into  consideration  both  bias  and  variance.  A possible 
measure  could  be  given  by  the  mean  square  error  matrix,  defined  by 
E(x  - x)(x  - x)T.  The  circumstance  that  the  constrains  are  linear 
opens  up  the  possibility  of  developing  a statistical  test  procedure  to 
verify  whether  or  not  there  is  a reduction  in  the  mean  square  error 
by  the  imposition  of  the  constraints.  This  is  done  by  testing  whether 

or  not  the  hypothesis  that  the  matrix  E(£  - x)(x  - x)  - E(x  - x) 

(£  - x)T  is  positive  semidefinite  is  true.  This  procedure  is  due  to 
Toro  Vizcar rondo  and  Wallace  T4-2].  It  makes  use  of  the  F -statistic 
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described  in  chapter  3.  The  improvement  in  the  mean  square  error 
by  the  introduction  of  the  equality  constraint  can  be  expressed  by  the 
fict  that  the  noncentrality  parameter  X of  the  test  statistic  is  lf-.ss 
chan  or  equal  to  one-half.  Observe  that  the  test  to  verify  whether  or 
not  the  linear  restrictions  are  true  checks  whethe.*  or  not  this  non- 
centrality parameter  is  zero.  In  the  MSE  test  one  is  not  concerned 
whether  or  not  the  linear  restrictions  on  the  pixel  values  are  true, 
but  whether  or  not  the  imposition  of  these  constraints  represents  an 
improvement  in  MSE.  One  problem  that  occurs  in  the  application  of 
the  Toro  Vizcarrondo -Wallace  test  is  that  the  decision  regions  have 
been  tabulated  so  far  for  only  one  linear  constraint.  In  this  case  the 
experimenter  can  always  perform  the  F-test  for  linear  hypothesis, 
which  has  been  tabulated  for  all  degrees  of  freedom  and  uses  the 
same  statistics  as  the  Toro  Vizcarrondo-Wallace  test.  This  will  not 
tell  whether  there  is  an  improvement  in  mean  square  error,  but 
whether  or  not  the  imposed  linear  restrictions  are  satisfied  by  the 
true  parameter  vector. 

The  choire  of  the  linear  constraint  to  he  imposed  should  be 
judged  by  two  factors.  The  first  one  is  the  knowledge,  coming  from 
a priori  considerations,  that  the  relationship  is  true  or  at  least  ap- 
proximately true  so  that  an  excessive  bias  will  not  be  introduced  in 
the  answers.  This  is  important  when  the  linear  relationships  are 
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subject  to  random  error,  as  is  the  case  when  they  come  from  the 
observation  of  pixel  values  in  another  blurred  and  noisy  image  of  the 
same  object.  The  second  factor  that  the  experimenter  should  have  in 
mind  is  that  a linear  relationship  tends  to  be  more  effective  in  re- 
ducing variance  if  it  is  in  the  direction  of  smaller  axes  of  the  K- 

ellipsoid  rather  than  the  larger  ones.  This  can  also  be  expressed  by 

T 

the  fact  that  the  vector  a in  the  linear  restriction  a^x  = tp,  p 

” T -1 

lf  P should  be  in  the  direction  of  the  eigenvectors  of  H C H 

corresponding  to  the  smallest  eigenvalues.  These  retirements  may 
not  be  very  easy  to  conciliate  in  practice.  Nevertheless,  the  MSE 
test  provides  a tool  to  verify  whether  or  not  linear  equality  restric- 
tions should  be  used  in  the  restoration. 

Nonexact  linear  constraints  involving  pixel  values  can  also  oe 
incorporated  in  another  way,  if  the  uncertainty  can  be  modeled  by  a 
random  process,  with  a known  covariance  matrix.  An  example  would 
be  the  use  of  an  additional  blurred  and  noisy  image  of  the  same  ob- 
ject. Suppose  that  the  uncertain  linear  constraints  are  expressed  by 

t 

t_  = Ax  + v (4.2-6) 

where  the  covariance  matrix  of  v is  Tand  v is  assumed  to  be  indepen- 
dent of  the  n,  for  simplification.  The  combination  of  the  sample  inf  or 
mation  expressed  by  (4.  1-1)  and  a priori  information  given  by  (4.  2-5) 


can  be  accomplished  through  the  model 


1 

- 

“ 1 
H 

x + 

n 

_t 

A 

V 

. — 

— 

Under  this  model,  the  estimator  x is  given  by  (using  (5.2-3)) 


The  covariance  matrix  of  the  new  estimator  is  easily  obtained  as 


C£  = (HTV“1H  + ATT_1A)  1 (4.2-8) 

- x 

This  technique  of  incorporating  random  linear  constraints  in 
regression  is  known  in  econometrics  as  mixed  estimation  [4-3  and 
4-4].  The  connection  between  this  procedure  and  the  Bayesian  ap- 
proach is  strong.  In  fact,  suppose  that  t = 0andA  = I.  This  is 
equivalent  to  stating  that  the  vector  of  pixel  values  has  an  priori  dis- 
tribution with  zero  mean  and  covariance  matrix  V.  Under  these  con 


ditions,  expression  (4.2-7)  reduces  to 
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x = (HTV_1H + T'V1HTV'1Y  (4.2-9 

which  is  the  familiar  result  of  the  Wiener  filter. 

Observe  at  this  point  that  the  linear  constrained  or  mixed 
estimator  procedures  can  make  use  of  Fourier  techniques  if  the 
matrices  V and  T are  of  the  form  A and  H and  A represent  convo- 
lution operations.  This  would  be  the  case  when  the  constraints  are 
represented  by  an  additional  picture  and  the  blur  in  the  observed 

Picture  as  well  as  in  the  additional  picture  is  space  invariant  and  in 
both  pictures  the  noise  is  white. 

The  extension  of  the  mixed  estimation  technique  to  multi- 
exposure of  the  same  object  for  more  than  two  pictures  is  straight- 
forward. Given  K independent  observations 


*k  = M+^k 


(4.2-10) 


with  the  covariance  matrix  of  n being  V the  B.  L.U.E. 


estimator 


x is  obtained  as 


K T -1 

£ Bk  \ 

Lk=l  K k ^ 


1 


(4.2-11) 


and  the  covariance  matrix  of  the  estimator  is 
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c 


(4.2-12) 


So  far  the  discussion  on  linear  constraints  has  been  restricted 
to  the  overdetermined  model  for  image  restoration.  When  the  blur 
matrix  H has  any  rank,  the  analysis  of  the  equality  constrained  least 
squares  problem  can  be  carried  out  by  the  following  procedure. 
Assuming  that  the  vector  t_  in  (4.  2-1)  is  in  the  range  of  A,  the  general 
solution  of  this  equation  is  expressed  by 


xq  = A^t  + (I  - A+A)u 


(4.2-13) 


for  some  vector  u.  Observe  that  the  system  (4.2-1)  should  be  under  - 
determined  if  the  equality  constraints  do  not  involve  any  randomness, 
otherwise  these  restrictions  would  determine  the  solution  by  them- 
selves, irrespective  of  what  the  observed  blurred  values  are.  Under 
white  noise,  the  solution  minimizes  the  norm  of  the  vector  (x_Hx) 
over  the  set  expressed  by  (4.2-13).  Therefore,  the  problem  has  been 
transferred  to 


min  ||  y.  " Hu  j|  (4.2-14) 

u 


where 
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1.  ~ Y.  ■ HA.4^  (4.2-15) 

and 

H = H(I_-A+A)  (4.2-16) 


The  general  solution  of  (4.2-14)  is  given  by 


u 

“O 


= H +x  + (I_-HH+)z 


(4.2-17) 


for  some  vector  z.  Substituting  this  value  of  u into  expression 

— —X) 

(4.2-13),  the  solution  for  x is  obtained  as 
x = A+t  + (I-A+A) 

_Q  — — “ 


+ (I_-HH+)z 


(4.2-18) 


By  observing  that 


— T T f 

H (HH  ) 


(4.2-19) 


and  also  that 


(I- A+A)2  = (I_-A+A)  = (I-AV 


(4.2-20) 


which  comes  from  the  fact  that  (I_-A  A)  is  a projection  matrix  (onto 
N(A)),  equality  (4.2-18)  can  be  expressed  as 


xq  = A+_t  + H4^  + (I-A+A)(I_-H4H)z 


(4.2-21) 
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iOr  some  vector  z_.  In  analogy  with  the  unconstrained  case,  it  can 
be  shown  T 3 -3,  page  32]  that 

x = A+_t  + H+x  (4.2-22) 

is  the  minimum  norm  solution. 

J 

Expression  (4.2-18)  (or  (4.2-21))  gives  the  general  form  of 
any  solution  to  the  linear  constrained  restoration  problem.  This 
solution  will  be  unique  if  and  only  if  H-H^H)  is  the  zero  matrix  or, 
equivalently , if  and  only  if  the  null  space  of  H,  N(H),is  the  zero  vector. 
The  constraints  substitute  the  matrix  H for  the  matrix  H for  the  de- 
termination of  the  uniqueness  of  the  solution.  By  the  definition 
(4.2-16)  and  by  the  observation  that  (I-A+A)  projects  any  vector 
onto  N(A),  this  necessary  and  sufficient  condition  can  also  be  expres- 
sed by  the  condition 


N(H)  n N(A)  = 0 (4.2-23) 

where  ()  is  the  zero  vector.  This  can  also  be  viewed  from  another 
perspective!  one  can  always  add  a vector  lying  anywhere  on  N(H)  to 
the  solution  of  the  unconstrained  restoration  problem  and  a vector 
iying  anywhere  on  N(A)  tc  the  solution  of  the  linear  system  of  equality 
constraints.  When  the  two  systems  are  solved  together,  the  solution 
will  clearly  be  unique  if  their  intersection  contains  a single  vector, 
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which  is  necessarily  the  zero  vector.  Therefore,  the  use  of  linear 
equality  constraints  can  transform  the  nonunique  solution  of  the 
underdetermined  restoration  problem  into  a unique  solution  if  the 
constraints  compensate  for  the  lack  of  information  of  the  sample. 
This  transforms  the  nonestimable  vector  jcinto  an  estimable  one. 

The  imposition  of  linear  equality  constraints  may  still  not 
guarantee  the  uniqueness  and  estimability  of  the  solution  vector  if 
condition  (4.2-23)  is  not  satisfied.  Nevertheless,  the  constraints 
may  transform  previously  inestimable  parametric  functions  of  pixel 
values  into  estimable  functions.  In  fact,  in  the  unconstrained  case, 
the  only  estimable  functions  cTx  were  those  such  that  the  vector  c 
was  orthogonal  to  N(H).  Wit!  the  equality  constraints  the  set  of  esti- 
mable functions  comprises  all  those  such  that  £ is  orthogonal  to  the 
intersection  of  N(H)  and  N(A).  The  latter  set  clearly  contains  the 
former  one. 

4.  3 Inequality  Constraints 

In  the  last  section  the  a priori  knowledge  involving  linear  com- 
binations of  pixel  values  was  analyzed  and  used  in  the  restoration  of 
blurred  images  corrupted  by  noise.  However,  quite  often  the  ex- 
perimenter has  a priori  information  in  the  form  of  inequality  con- 
straints involving  the  pixel  values.  This  is  particularly  true  in 
image  processing.  In  fact,  the  physics  of  image  formation  determine 
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that  pixel  values  should  be  nonnegative  quantities.  Furthermore,  an 
upper  bound  on  these  values  is  often  known,  as  is  the  case  when  the 
images  are  digitized  and  a finite  number  of  bits  is  assigned  to  each 
pixel.  The  analyst  may  also  want  to  combine  equality  and  inequality 
constraints  in  the  restoration  model.  It  will  be  shown  in  the  following 
that  if  these  constraint,?  are  linear  and  if  a squared  error  is  used  as 
a criterion,  a tractable  mathematical  model  is  developed,  leading  to 
considerably  improved  restoration  results. 

Suppose  that  the  linear  model 

= Hx  + n (4.3-1) 

is  adopted  for  the  blurring  process  and  the  corruption  by  noise.  As 

before,  H is  an  (M2  x N2)  matrix  and_y,  x and  n are  vectors  with  com- 

2 

patible  dimensions.  Th  j rank  of  the  H matrix  is  R.  If  R - N or 
R < N2,  the  model  will  be  overdetermined  or  under  deter  mined, 
respectively.  The  covariance  matrix  of  the  noise  will  be  assumed  to 
be_V.  The  constraints  will  be  expressed  by 

Ax  = 1 (4-3-2) 

and 


0 < x < u 


(4.3-3) 
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2 2 
where  A is  an  (S  x N ) matrix  of  rank  S < N . 

Under  the  least  squares  criterion  the  objective  function  to  be 
minimized  is  given  by 

(X-Hx)T  V-1(y-Hx)  (4.3-4) 

The  minimization  of  (4.  3-4)  subject  to  the  constraints  (4.3-2)  and 
(4.  3-3)  may  also  be  obtained  if  the  vector  x is  supposed  to  be  random, 
with  a uniform  distribution  in  the  region  defined  by  the  constraints, 
and  gaussian  noise  corrupts  the  image.  Under  the  maximum  a 
posteriori  (MAP)  estimation  criterion  (or  maximum  likelihood,  since 
p(x)  is  a constant  in  the  interval),  one  looks  for  the  vector  x such  that 
p(x|i)  p(x|y;)  for  any  x.  Using  the  fact  that  the  logarithm  is  a mono- 
tonic increasing  function  and  also  the  gaussian  assumption  on  the  noise 
it  is  equivalent  to  maximize  the  function 

log  p(x|x)  = log  p(x)  - j (y  -Hx)V~1(y-Hx)  - log  p(^) 

Since  p(x)  is  constant  within  the  constraints,  the  maximization  of 
log  p(x|x)  leads  to  the  minimization  of  the  quadratic  form  (4.3-4) 
subject  to  the  linear  constraints  (4.  3-2)  and  (4.  3-3). 

In  order  to  obtain  the  quadratic  programming  problem  in 
standard  form,  some  manipulation  is  necessary.  Define  a slack 
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vector  £ with  nounegative  entries  such  that 


(4.3-5) 


Also,  introduce  a new  matrix  defined  by  the  expression 


(4.3-7) 


or 


where 


(4.3-8) 


$ 

If  a matrix  H is  defined  as 


H*  = (H  C) 


(4.3-9) 


The  linear  model  for  restoration  can  then  be  expressed  as 
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* * 

y_  = H x + n 


And  the  objective  function  of  the  least  squares  is 


* 

(x-IT 


- 1 * 

v (x-H 


* 

x ) 


(4.3-10) 


(4.3-11) 


subject  to  the  constraints 

Bx*  = v (4.3-12) 

and 

x*  ;»  0 (4.3-13) 

The  previous  equations  express  the  standard  form  of  a quadratic 
programming  problem,  namely,  the  minimization  of  a quadratic  form 
subject  to  linear  constraints  on  the  variables. 

The  necessary  and  sufficient  condition  that  the  solution  of  such 
a problem  satisfies  is  expressed  by  the  Kuhn  Tucker  Theorem  C4-5, 
page  233].  In  the  particular  situation  of  a quadratic  objective  func- 
tion, this  theorem  can  be  expressed  by  the  following  conditions 

a)  x is  feasible,  that  is  equations  (4.3-12)  and  (4.3-13)  hold, 

b)  there  exist  vectors  u > £ and  w such  that 

2H*TV_1H*x*  - u + B*Tw  - 2H*TV  = 0 (4.3-14) 
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c)  the  vector  u and  the  optimal  solution  are  such  that 

T * 

H x = 0 (4.3-15) 

The  Kuhn  Tucker  conditions  for  quadratic  programming  require  the 

% 

existence  of  vectors  x , u,  and  w satisfying  linear  equalities  and  in  - 
equalities  plus  the  additional  condition  that  u x = 0.  This  suggests 
the  use  of  a method  very  similar  to  the  one  that  verifies  the  existence 

of  a feasible  solution  to  initialize  the  simplex  method  in  linear  pro- 

T * 

gramming.  The  condition  ii  x = 0 is  nonlinear,  however,  and  re- 
quires a modification  of  the  usual  procedure. 

Thi3  is  the  basis  of  the  algorithms  for  quadratic  programming 
that  rely  on  the  simplex  method  for  linear  programming.  Two  of  the 
main  procedures  are  Wolfe's  [4-6]  and  Dantzig's  £4-7]  algorithms. 
The  former  has  two  versions,  a short  form  and  a long  form.  The 
first  one  is  capable  of  handling  only  positive  definite  quadratic  forms 
(which  occurs  when  the  model  is  overdetermined),  but  the  second  one 
can  also  deal  with  positive  definite  forms  (for  the  underdetermined 
case).  There  are  several  other  algorithms  available  for  quadratic 
programming  problems  [4-8]  . In  recent  years  a research  effort  has 
been  under  way  towards  developing  numerically  stable  methods  [4-9, 
4-10,  4-11,  and  4-12]  to  solve  this  important  problem. 

The  method  developed  so  far  to  sob  e the  linear  inequality 
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constrained  restoration  problem  w?  s directed  toward  the  estimation 
of  the  image  itself.  Quite  often  one  would  be  interested  in  estimating 
parametric  functions  of  pixel  values.  If  this  is  the  case,  a reformu- 
lation of  the  problem  is  convenient.  This  will  lead  to  the  calculation 
of  confidence  intervals  for  the  constrained  problem. 

In  chapter  3,  the  computation  of  the  confidence  intervals  was 
made  by  considering  the  K -ellipsoids  and  the  support  planes  ortho- 
gonal to  the  vector  c_  that  wa?  used  to  calculate  the  parametric  func- 
tion cx.  In  the  overdetermined  model,  if  the  vector  £ is  parallel  to 

some  axis  of  the  ellipsoid  corresponding  to  a large  eigenvalue  of 
T -1 

H V H,  the  confidence  interval  would  be  small  compared  to  the  case 

when  (C  is  parallel  to  an  axis  corresponding  to  a small  eigenvalue.  In 

T - 1 

the  under  deter  mined  model  some  eigenvalues  of  H V H are  zero  arid 
the  confidence  interval  for  some  parametric  functions  is  infinite, 
which  corresponds  to  the  fact  that  these  functions  are  inestimable. 

The  a priori  knowledge  involving  inequality  constraints  may 
change  this  situation  considerably.  The  restrictions  0 < x <u  may 
bound  the  elongated  (in  the  ill  conditioned  case/  or  degenerate  (in  the 
underdeter  mined  case)  ellipsoids,  reducing  the  confidence  interval  in 
the  former  case  and  transforming  inestimable  functions  into  esti- 
mable functions  in  the  latter  one.  It  io  also  clear  that  the  estimation 
of  the  vector  x of  pixel  values  may  be  improved,  with  the  bounding  of 
the  ellipsoids  by  the  hyperplanes.  The  ill  conditioning  and  the 
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underdeterminacy  problems  may  be  solved,  at  least  from  the  sta- 
tistical point  of  view,  with  the  use  of  linear  inequality  constraints. 

In  order  to  compute  the  confidence  intervals  for  the  inequality 
constrained  case,  it  is  necessary  to  introduce  the  idea  of  a confidence 
ellipsoid  for  a multidimensional  distribution.  The  assumption  of 
gaussian  noise  will  be  made  throughout  the  discussion. 

For  a given  estimator  xof  x,  the  100.  a % confidence  interval 
of  x is  the  ellipsoid  in  the  x-space  with  center  in  £and  given  by  the 
expression 

(x-x)THTV_1H(x-x)  = Y2  (4.3-16) 

2 

where  Y is  selected  such  that 

Pr  { (* -x)THTV_1H(x-x)  <jY2}  = a (4.3-17) 

2 

The  value  of  y can  be  computed  by  observing  that  the  quadratic  form 
in  (4.  3-17)  is  distributed  according  to  a X -di s tr ibution  with  r de- 
grees of  freedom,  where  r is  the  rank  of  the  blur  matrix  H.  This 
means  that  [3-2] 
y2 

a=J  [r(p/2)Zf  /2^j  " 1 pr/2_1  . exp(-p/2)dp  (4.3-18) 

o 

where  F(«)  is  the  gamma  function  defined  by 

00 

r(p)  = Jo  yp  1 e y dy 


(4.3-19) 
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Now,  suppose  that  the  constraint  x £ 0 is  imposed  on  the  estimator. 
This  means  that,  with  probability  one,  x will  be  in  the  positive  quad- 
rant. Without  this  constraint,  x is  guaranteed  to  be  in  the  Y - ellip- 
soid a%  of  the  time.  Therefore,  with  the  imposition  of  the  constraint, 
the  estimator  x will  be  in  the  intersection  of  the  Y- ellipsoid  and  the 
positive  quadrant  100.  a % of  the  time. 

T 

Consider  now  the  parametric  function  £ x and  the  planes  ob- 
tained by  setting  this  function  equal  to  a given  constant.  There  are 
two  support  planes  of  the  region  mentioned  in  the  previous  Paragraph. 
Figure  (4.3-1)  taken  from  [3-2,  page  203],  illustrates  the  as  sertion 
for  the  case  where  the  vector  x has  two  components.  The  two  support 
planes  will  be  denoted  by 

S_  = [x (cTx  = 0^  } (4.3-20) 

s+  = {xJcTx  = 0EU]  (4.3-21) 

x is  guaranteed  to  lie  between  S+  and  S with  a probability  of  100a%. 
The  interval  1^(0)  = j^0E  , 0EJ  a confidence  interval  for  0. 

The  interval  1^,(0)  can  also  be  expressed  as 


y«>  * 


min  l£Tx|  (x-x)THTV  *H(x-x)  £ V2  \ , 

[x>0  J 

max  {cTx|  (x-x)THTy_1H(x  x)  5 (4.3-2 

x>0  J 


or  as 


y«>  * 


min  I cj’ x | (y-Hx)^V  *(y-Hx)  < r + 1 

x>0  ° ' * 

max  {cTx | (£-Hx)TV  ^^-Hx)  < r + y^  }](4.  3-23) 


x>  0 


where  r^  is  the  minimum  value  assumed  by  the  quadratic  residual 
expression. 

Since  the  minimum  or  the  maximum  of  the  linear  function  can 
occur  only  at  the  boundary  of  the  ellipsoid,  the  expression  for  the 
confidence  interval  can  also  be  written  as 


y«>  * 


min  |c.Tx|  (£-Hx)^V  * (^-Hx)  = 
x>£ 


'o  + V } . 


max  [cTx|  (y.-Hx)TV  ^^-Hx)  = r + y2  | (4.3-24) 
x >y  ° 


As  pointed  out  by  Rust  and  Burrus  [3-2,  page  168],  the  two 
optimization  problems  that  define  the  confidence  interval  can  be  for- 


114 


mulated  in  a somehow  dual  way  by  making  the  following  observation: 
the  end  points  of  the  confidence  interval  are  the  points  at  which  the 
two  support  planes  make  contact  with  the  lowest  level  contours  in  the 
positive  orthant.  Figure  (4.3-2)  illustrates  this  point. 

The  inverted  problems  have  the  form 

ro  + Y2=min  - Hx)TV"  ^ - Hx)  I cTx  = } (4.3-25) 

x > 0_ 

rQ  + V 2 = min  {(£  - Hx)TV^(y  - Hx)  | £Tx  = 0U  } (4.3-26) 

x >£ 

This  formulation  now  leads  to  quadratic  programming  problems,  with 
the  role  of  the  constraints  and  the  objective  function  reversed. 

The  calculation  of  the  confidence  interval  is  facilitated  by  the 
construction  of  a curve  expressed  by 

T (0)  = min  {(z-Hx)TV_1(j:-Hx)|  £Tx  = <t  } (4.3-27) 

x>0 

Figure  (4.3-3)  illustrates  the  curve  given  by  (4.3-27),  for  fhe  case  of 

Figure  (4.  3-2).  The  bottom  of  the  curve  is  in  parabolic  form,  show- 

2 

ing  that  for  small  values  of  y the  inequality  constraints  »re  not 

2 

being  inforced.  When  y increases,  however,  these  constraints 
start  becoming  effective  and  the  curve  rises  steeper  than  the  para- 
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Figure  (4.3-2)  The  Determination  of  Contact  Points 
for  Confidence  Intervals 
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Figure  (4.  3-3)  The  Curve  for  Determination 
of  the  Confidence  Intervals 
for  Linear  Inequality  Constrained  Restoration 
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2 2 

bola.  For  a given  value  of  Y given  by  the  tables  of  X distribution, 

the  value  F(jl)  = r + Y determines  two  values  on  the  abscissa  of 
o 

Figure  (4.  3-3),  $ and  0U  , that  are  the  extremes  the  confidence  in- 
terval for  thii  parametric  function  of  pixel  values.  The  construction 
of  the  curve  can  be  accomplished  by  the  procedure  of  imposing  sim- 
ultaneously the  inequality  constraint  x > 0 and  the  equality  constraint 
T 

c x = 0 for  several  values  of  0. 

An  observation  should  be  made  at  this  point.  The  uncon- 
strained confidence  intervals,  developed  in  chapter  3,  and  making  use 
of  the  K-ellipsoids,  are  optimum  in  the  sense  of  giving  the  shortest 
possible  confidence  interval  for  a given  confidence  level.  No  claim 
of  optimality  is  made  for  the  constrained  confidence  intervals  using 
quadratic  programming  and  based  on  the  Y -ellipsoids.  In  fact,  they 
may  be  very  pessimistic,  particularly  for  large  rank  of  the  blur 
matrix  H,  when  the  ratio  Y /K  becomes  large.  Further  research  is 
needed  in  order  to  obtain  tighter  intervals. 

It  should  be  remarked  that  if  the  Y -ellipsoid  is  centered  out- 
side the  inequality  constraints,  the  computation  of  the  confidence 
intervals  for  small  levels  of  confidence  may  not  be  possible.  In  this 
case  one  possible  solution  consists  of  replacing  the  constraint 

(X  - Hx)TV  *(X"Hx)  = r0  + ^ 


(4.3-28) 


by 


where 


(X-Hx)^'1^ -Hx)  = + Y2 
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(4.3-29) 


ro  “ min  (X  - Hx)TV_1(z-Hx)  > r 
x > 0 o 


(4.3-30) 


In  the  preceding  discussion  of  confidence  intervals  the  assump- 
tion of  gaussian  noise  has  been  made.  As  a consequence,  the  Y-elllp- 
soids  contain  a given  percentage  of  this  multidimensional  distribution. 
An  entirely  similar  analysis  can  be  carried  out  when  these  ellipsoid, 
contain  the  whole  set  of  possible  values  for  the  estimator.  This  would 
be  the  case  when  the  noise  is  distributed  in  some  fashion,  uniformly 
°r  not,  within  the  bounds  of  an  ellipsoid.  In  this  case,  the  computed 
intervals  give  the  minimum  and  maximum  values  that  a parametric 
function  can  assume,  with  probability  one.  This  bounded  distribution 
does  not  have  to  be  restricted  to  the  ellipsoid  shape.  Suppose  that  the 
noise  components  satisfy  the  linear  constraint 


M 

s hw 
1=1 


(4.3-31) 
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where  s^  are  positive  quantities.  This  equation  defines  the  shape  of 
a diamond  in  a higher  dimensional  space.  In  analogy  to  expression 
(4.  3-24),  the  lower  and  upper  bounds  of  the  inequality  restricted  con- 
fidence interval  are  given  by 


and 


*1  f T I M 1^  "--*i  I 2 "i 

0 = min  c x | £ £ 4 J 


x>0 


i = l 


(4.3-32) 


*u  f T 

0 = max  j £ x 

x>  0 


T I M 


£ 

i = l 


4 


} 


(4. 3-33) 


The  two  previous  equations  clearly  define  linear  programming  prob- 
lems, namely,  the  minimization  (or  maximization)  of  a linear  function 
subject  to  linear  constraints. 

Another  type  of  bounded  distribution  that  leads  to  linear  pro- 
gramming is  the  rectangular  distribution,  defined  by 

^ U2  (4.3-34) 

The  confidence  interval  will  be  given  by 


max 


1 <i<  M 
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0^  = min  {c^x  j 


x>0 


max  — J — £ a f (4.3-36) 

Kj<M2  "j 


With  some  manipulation  it  is  possible  to  make  it  clear  T3-5,  page  981 
that  the  problems  defined  above  are  linear  programming  problems. 

In  fact,  equation  (4.  3-34)  is  equivalent  to  the  statement  that 


hi 


a. 

l 


2 2 
^ u i = 1,  2,  ...  , M 


(4.3-37) 


or,  by  making  u equal  to  unity 


-si  * £ a.  i = 1,  2,  ..  .,  M* 


This  can  be  expressed  as 


(i?x)j  * Yj  + Sj 


and 


(4. 3-38) 


(4.3-39) 


-(HxJj  * -y.  + s. 


(4.3-40) 


or,  in  matrix  form 


Hx  ^ ^ + s_ 


(4.  3-41' 


and 


-Hx  £ -v  + 


X + a. 


(4.3-42) 


i 
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t T 

where  js  = (s  . , s , • • • , s _)  . 

2 M2 

2 2 

Defining  A to  be  a 2M  x N matrix  expressed  by 


A = 


(4.3-43) 


and  l the  2M  x 1 vector 


i = 


“X 


W 


(4.3-44) 


The  previous  inequalities  are  expressed  as 


Ax  £ i 


(4.3-45) 


so  that  the  confidence  interval  can  be  given  by 


min  jc.  x I Ax  \ 
x>0  1 J 


(4.3-46) 


and 


0U  = max  |cTx  j Ax  £ | 

x > 0 


(4.3-47) 


The  last  two  expressions  are  clearly  linear  programming  formula- 


tions. 
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Linear  programming  is  also  used  in  two  other  possible  for- 
mulations of  the  restoration  problem,  for  the  calculation  of  the  opti- 
mal estimator  x,  instead  >f  linear  functionals, as  in  the  previous 
derivation.  As  the  first  example,  suppose  that  the  noise  components 
are  independent  and  identically  distributed  according  to  an  exponen- 
tial distribution 


P(ni}  = 2 GXP  {'|nil}  (4.3-48) 

Assuming  the  vector  x to  be  random  and  uniformly  distributed 
within  the  constraints  expressed  by  (4.  3-2)  and  (4.  3-3)  and  adopting 
either  the  criterion  of  MAP  estimation  or  maximum  likelihood,  the 
following  expression  is  to  be  minimized 

M2 

Q(x)  = £ \Y.  ~ (Hx)  I (4.3-49) 

i-1  1 1 

Observe  that  the  same  objective  function  would  have  been  obtained 
through  the  criterion  of  estimation  by  least  sum  of  absolute  deviations 
under  linear  equality  and  inequality  constraints. 

This  problem  can  be  formulated  in  terms  of  linear  program- 
ming by  observing  [4-13]  that  the  objective  function  can  be  expressed 


as 
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min  y) 
i=l 


+ 


(4.3-50) 


such  that 


(Hx)j 


i 


1.  2, 


M 


(4.3-51) 


and 


e,  » e SO  i = 1,  2,  . , M2 

*2 


(4.3-52) 


As  a second  example  of  the  use  of  linear  programming  in  re- 
gression analysis,  suppose  that  the  objective  is  to  minimize  the  maxi- 
mum absolute  derivation  in  the  regression  model  (Chebyschev  cri- 
terion). Then  one  seeks 


min  (max  | - (Hx)^)  (4.3-53) 

- l<i<M2 

The  reduction  to  a linear  programming  structure  [4-13]  can  be  done 
by  expressing  the  previous  objective  function  as 

mine  (4.3-54) 


such  that 
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-e  * y.  - (HxJj  * e i = 1,  2,  . . . , M2  (4.3-55) 

The  use  of  inequality  constraints  in  image  restoration  also  allows  for 
the  incorporation  of  a priori  knowledge  concerning  the  variation  of 
the  function  that  defines  the  original  picture.  This  can  be  done  by  a 
change  of  variables  f 3 -2,  pages  112-115  and  190-193]  . Consider  the 
one -dimensional  case  first.  If  the  solution  is  known  to  be  monotoni- 
cally  increasing  and  positive,  this  can  be  imposed  by  expressing  it 
in  terms  of  an  integral  of  a positive  function  with  a positive  initial 
condition.  In  discrete  form,  this  corresponds  to  the  change  of  vari- 
ables 


X = R.£,  £ 2:  £ 

and  R.  in  lower  triangular  form 

0 • • • 0 

1 • • . 0 

• • 

• . 

• • 

1 . . . 1 


(4.3-56) 


(4.3-57) 


If  the  solution  is  increasing,  but  not  necessarily  positive,  the  initial 
condition  can  assume  any  value,  so  it  can  be  expressed  by  a difference 
of  two  positive  quantities.  In  this  case  the  matrix  R takes  the  form 
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1-1  0 0 • • • 0 

1 -1  1 0 ...  o 


.... 


1-111 


(4.3-58) 


If  the  function  is  known  to  be  positive  and  convex,  it  is  enough 
to  express  it  as  a double  integral  of  a positive  function  with  a positive 
initial  condition.  In  matrix  form  this  is  done  by  expressing 


x = R£  g_  ^ 0 


where 


(4.3-60) 


R = ST  (4.3-61) 

where  S and  T are  matrices  with  the  form  expressed  by  (4.  3-57). 

If  the  solution  is  positive  and  known  to  possess  some  degree 
of  smoothness,  this  can  be  subjectively  incorporated  by  expressing 
x as  a positive  linear  combination  of  N vectors  that  have  positive 
components,  are  linearly  independent  and  have  some  smooth  variation 
in  their  components.  Reference  [3-2,  page  114]  suggests  the  use  of 
vectors  whose  components  form  a triangular  function  and  are  shifted 
from  vector  to  vector. 
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The  application  of  these  techniques  to  the  two-dimensional 
problems  that  occur  in  image  processing  is  easily  accomplished  by 
the  use  of  vector  notation.  If  the  experimenter  feels  that  the  varia- 
tion of  some  row  or  some  column  in  the  picture  has  one  of  the  prop- 
erties described  in  the  preceding  paragraphs,  this  can  be  done  by 
substituting  the  variables  corresponding  to  the  row  or  column  by  the 
appropriate  transformation.  The  matrix  R will  be  diagonal  with 
elements  equal  to  one  everywhere  in  the  diagonal  except  in  the  ele- 
ments that  are  transformed.  Once  this  is  done,  both  the  mathe- 
matical programming  problems  of  restoring  the  image  itself  or  cal- 
culating approximate  confidence  intervals  for  parametric  functions 
can  be  solved  in  terms  of  the  new  vector  a,  with  a new  blur  matrix 
H - H . R . When  the  problem  is  solved  in  terms  of  £,  the  trans- 
formation x = Ra  is  used  to  obtain  the  desired  solution. 

The  discussion  on  inequality  constrained  restoration  will  now 
be  concentrated  on  the  problem  of  the  calculation  of  the  sampling  dis- 
tribution of  the  estimators.  The  Imposition  of  inequality  constraints 
affects  considerably  that  distribution.  While  in  the  unconstrained  or 
linear  equality  constrained  restoration,  under  the  gaussian  hypothe- 
sis, the  estimators  would  still  be  normally  distributed,  in  the  case  of 
inequality  constraints,  the  distribution  is  of  the  mixed  type,  partly 
continuous,  inside  the  permissible  region  and  partly  discrete,  at  the 
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boundary  of  this  area.  The  situation  is  exemplified  by  Figure 
(4.3-4),  obtained  from  reference  T3-6,  page  354],  for  the  one- 
dimensional case.  In  case  a)  the  true  parameter  x satisfies  the 

o 

constraint  x ^ 0.  There  is  a positive  bias  since  the  distribution  of 
the  estimator  is  moved  to  the  right  by  the  constraint.  However,  if 
the  mean  square  error  is  computed,  taking  into  account  both  bias  and 
variance,  there  is  an  improvement,  because  that  distribution  tends 
to  be  more  concentrated  around  the  true  value.  Case  b)  shows  the 
opposite  situation.  The  constraint  x ^ 0 is  invalid,  that  is,  the  true 
(and  unknown)  parameter  xq  is  negative.  The  bias  is  still  positive 
but  there  will  be  an  improvement  in  the  mean  square  error  only  if 
Xq  is  not  too  far  from  0.  If  this  is  the  situation,  the  probability  mass 
concentrated  at  0 will  contribute  less  for  the  mean  square  error 
than  the  part  of  the  distribution  for  negative  values  of  x.  Zellner 
T4-14  and  4-15]  calculated  the  moments  of  this  mixed  distribution  in 
the  one -dimensional,  gaussian  case.  His  conclusions  can  be  sum- 
marized in  the  table  below  that  gives  the  mean  square  error  of  the 

constrained  estimator  expressed  as  a fraction  of  the  mean  square 

x -nr 

error  of  the  unconstrained  estimator.  The  entry  — — measures 

a 

the  distance  between  the  trie  parameter  x and  the  value  a that  de- 

o 

termines  that  constraint  x ^ a as  a fraction  of  the  standard  deviation 
a of  the  unconstrained  estimator. 
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x -a  - 
o 3 

2.5 

2 

1.  5 

1 

0.  5 

1 

-0.5 

| 

-1.  5 

B 

-2.5 

-3.  0 

a 

1. 0 

8 

0.96 

0.89 

8 

8 

8 

0 . 66 

8 

8 

6.26 

9.0 

The  constrained  estimator  is  seen  to  be  superior  to  the  unconstrained 
even  if  the  true  parameter  slightly  violates  the  constraints.  In  the 
referred  papers,  Zellner  also  studied  the  distribution  of  a flexible 
bound  procedure  in  which  the  estimator  is  given  by  a linear  convex 
combination  of  the  unconstrained  and  constrained  estimators,  re- 
flecting the  lack  of  absolute  confidence  of  the  analyst  in  the  imposed 
bounds. 

When  the  dimensionality  of  the  problem  is  greater  than  one, 
the  difficulties  in  computing  the  sampling  distribution  or  even  the  first 
two  moments  increase  considerably.  Hocking  [4-16]  has  obtained 
closed  form  solutions  for  the  case  where  the  constraints  are  de- 
scribed by  a single,  smooth,  convex  surface.  Some  Monte  Carlo 
experiments  involving  small  dimensionality  have  been  reported  by 
Lee,  Judge  and  Zellner  [4-17]  in  the  estimation  of  transition  prob- 
abilities of  a Markov  probability  model. 

In  the  context  of  image  processing,  the  large  dimensionality 
of  the  problem  makes  any  attempt  to  calculate  the  sampling  distri- 
bution of  the  constrained  estimator  extremely  difficult.  However,  the 
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confidence  of  the  experimenter  in  the  imposition  of  the  constraints 
is  very  high,  at  least  for  the  lower  bound  x s 0,  due  to  the  fact  that 
they  come  from  physical  laws  governing  the  process  of  image  for- 
mation. As  a result,  by  extending  Zellner's  conclusions  to  a higher 
dimensional  space,  there  should  be  considerable  reduction  in  mean 
square  error  through  the  use  of  these  restrictions. 


5.  EXPERIMENTAL  RESULTS 
WITH  UNCONSTRAINED  RESTORATION 

In  this  chapter  the  experimental  results  obtained  with  com- 
puter simulation  studies  of  digital  image  restoration  are  described. 

In  order  to  expose  the  most  salient  features  of  the  statistical  and 
numerical  problems  found  in  the  restoration  under  a regression 
model,  a simplified  artificial  picture  has  been  generated.  The  pic- 
ture consists  of  an  8 x 8 pixel  image,  containing  a bright  square  of 
value  245  on  a constant  background  of  value  10  over  a 0 to  255  scale. 
Figure  (5-1)  displays  this  picture  used  as  the  object  of  blurring  and 
addition  of  noise  in  the  simulation  experiments. 

F or  the  purpose  of  displaying  pictorial  information,  two 
operations  had  to  be  performed  in  the  pictures  obtained  in  these  ex- 
periments. First,  the  images  were  blown  up  to  the  size  of  256  x 
256.  This  explains  the  checkerboard  pattern  that  results.  Second,  a 
redistribution  of  the  pixel  values  is  necessary.  This  redistribution 
consists  of  clipping  the  results  to  the  0-255  scale.  In  some  instances 
the  restored  values  will  be  far  from  this  interval  and  in  order  to  make 
a meaningful  judgment,  a display  of  the  actual  numerical  results  will 
be  made.  The  objective  of  the  experiments  was  to  investigate,  in  the 
context  of  image  restoration,  the  usefulness  of  the  regression  model 
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developed  in  chapter  3 when  no  additional  constraint  is  impeded  in 
the  restored  values. 

The  overdetermined  model  is  studied  first,  followed  by  the 
underdetermined  one.  The  computation  of  confidence  intervals  of 
some  parametric  functions  is  performed  and,  finally,  a few  results 
concerning  the  testing  of  hypothesis  is  presented. 

5.  1 Restoration  for  the  Over  deter  mined  Model 

Experiments  with  the  overdetermined  model  simulated  the 
following  real  problem.  The  blurred  and  noisy  image  of  an  object 
°f  finite  extent  (e.jj. , the  moon  on  a dark  background)  is  available. 

As  pointed  out  before,  the  use  of  digital  processing  requires  that 
this  blurred  image  be  sampled  at  a finite  number  of  points.  Further- 
more, the  original  image  has  to  be  estimated  based  also  on  a finite 
number  of  points  that  are  the  nodes  of  the  quadrature  integration. 

The  experiments  represent  the  situation  in  which  the  analyst  decides 
to  place  those  nodes  at  equally  spaced  points  on  a rectangular  grid 
over  the  finite  object,  taking  into  account  the  several  factors  dis- 
cussed in  section  2.  3.  The  8x8  original  picture  of  Figure  (5-1) 
represents  the  original  object  as  if  it  were  available  at  these  quad- 
rature nodes.  The  blurred  image  is  assumed  to  be  sampled  at  points 
separated  by  the  same  distance  as  the  nodes  of  quadrature.  There 
are  more  sampling  points  than  nodes,  covering  a blurred  picture  that 
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is  larger  in  extern  than  the  original.  Some  of  the  sampling  points 
coincide  with  the  quadrature  nodes.  The  blurred  picture  is  still  of 
finite  extent,  which  implies  a point  spread  function  of  finite  support. 


This  represents  an  approximation  to  the  real  case,  obtained  by  trun- 
cation of  the  kernel.  In  the  experiment  tne  support  was  taken  as  a 
multiple  of  the  sampling  distance  and  the  kernel  was  assumed  to  be 
zero  beyond  two  times  this  distance.  The  full  extent  of  the  blurred 
picture  was  assumed  available.  Under  these  conditions,  the  original 
8x8  object  was  blurred  into  a 12  x 12  image.  Using  the  notation  of 
equation  (2.  3-4)  the  following  conditions  describe  the  experiments 


I = J = N = 8 
K = L = M = 12 


(5.  1-1) 


Figure  (5.  1-1)  describes  the  data  arrays  involved.  In  that  figure  a 
translation  of  the  enumeration  of  the  original  picture  was  done.  The 
values  of  and  n^were  made  to  run  from  + l)  to 

Under  these  conditions,  the  truncation  of  the  impulse  respons< 
described  by 


se  is 


h<Wfy  V = 0 


(5.1-2) 


or 
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ORIGINAL  PICTURE 


(1,1! 


(I.M) 


BLURRED  PICTURE 


Figure  (5.1-1)  The  Data  Arrays 
in  the  Overdetermined  Model 


| n I M - N L - 1 

lal  • Pj  I > ~ = — 
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or 


VV  > 


M - N _ L - 1 
2 2 


By  using  the  techniques  described  in  section  2. 3,  a trans- 
formation of  the  two  dimensional  arrays  into  vector  form  is  done. 
It  results  a model  described  by  the  equation 


X = B*  + 2. 


where 


(5.1-3) 


X = (M  x 1)  vector 

B = (M2  x N2)  matrix 
2 

x = (N  x 1)  vector 
2 

n = (M  x 1)  vector 


The  description  of  the  simulation  experiments  leads  to  the 
following  structure  for  the  matrix  B.  First  partition  B in  submat- 

riCCS  “ i,  j °f  size  (M  x N)»  as  shown  in  Figure  (5.  1-2).  Then  each 
matrix  Bjj  is  composed  by  a similar  structure,  as  described  by 
Figure  (5.  1-3).  Observe  that  the  matrix  B has  considerable  structure 
being  very  sparse.  It  is  composed  of  a nonzero  diagonal  band  of  sub- 
ma trices  which,  in  turn,  contain  a nonzero  diagonal  band  of  elements. 
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In  a real  experiment  the  matrix  II  should  involve  both  the 
quadrature  weights  and  the  kernel  of  the  point  spread  function.  The 
weights  depend,  however,  on  the  type  of  quadrature  expression  that 
is  used.  In  order  to  circumvent  this  problem  a simplification  was 
made:  all  weights  were  assumed  to  be  equal  to  one.  Therefore,  the 
entries  of  the  blur  matrix  depend  only  on  the  point  spread  function. 

If  the  impulse  response  is  space  invariant,  then 

h(o,.  VV  v = h(V|V  VV  (5-'-4) 

Consequently,  the  columns  of  the  submatrices  of  are  shifted  ver- 
sions of  the  first  column  and  the  same  pattern  occurs  for  each  sub- 
matrix. Suppose,  furthermore,  that  the  impulse  response  matrix  H 
is  in  separable  form,  that  is,  h(a  § ; p.  , r\  ) is  expressed  by  the 

X K J 

product 


h(V§k  ; Pj,  V = h!(ai ; p.)  • h2(§k  ; r\j  (5.1-5) 

Then  it  can  be  shown  that  the  blur  matrix  13  is  given  by 

B = Bc©Br  (5.1-6) 

where  and  are  given  by  matrices  of  the  type  described  by 
Figure  (5.  1-4)  and©  denotes  the  Kronecker  product. 

Two  expressions  for  the  blur  have  been  used.  The  first  one 
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Figure  (5.  1-4)  Unidimensional  Blur  Matrix 
in  the  Separable,  Space  Invariant  Case, 
Overdetermined  Model 
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simulates  the  effect  of  atmospheric  turbulence  over  a long  exposure. 
The  blur  function  is  given  by 


} 


(5.1-7) 


where  the  coefficients  b^.  and  b^  control  the  amount  of  blur  imposed 
on  the  vertical  and  horizontal  directions,  respectively.  The  expo- 
nent 5/6  of  expression  (2.  1-5)  was  approximated  by  unity.  The 
second  blur  function,  also  space  invariant  and  in  separable  form, 
simulates  the  effect  of  a diffraction  limited  optical  system  as  given 
by 


(5.1-8) 


Both  blur  functions  are  truncated  to  dimension  L x L.  Once  the  pic- 
ture is  blurred  by  multiplying  the  vector  x by  the  blur  matrix  B^, 
gaussian  noise  from  a random  variable  generator  is  added  to  the  com- 
ponents of  the  vector  The  noise  is  uncorrelated,  with  a covariance 
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matrix  V given  by 

v - 2r 

y - <r  i (5. 1-9) 

Recall  that  under  the  gaussian  assumption  the  B.  L.U.E. 
estimator  is  also  the  maximum  likelihood  estimator,  besides  being 
the  least  squares  solution.  Furthermore,  since  the  over  deter  mined 
model  is  assumed,  the  solution  is  unique. 

Under  white  noise,  the  estimator  x is  obtained  through  the 
expres  sion 

x = (BTB)  1BT^  = B + £ (5.1-10) 

The  condition  that  the  spread  function  should  be  space  invariant  may 

be  explored  for  computational  purposes.  In  fact,  under  this  hypoth- 
T 

esis,  (B  B)  is  a block  Toeplitz  matrix  and  the  inversion  of  such  a 

matrix  can  be  performed  quite  efficiently  through  Fast  Fourier 

Transform  techniques  [5-1],  even  for  large  dimensions.  Further- 
T 

more,  B_  jr  represents,  in  this  case,  a discrete  convolution  opera- 
tion that  can  also  be  performed  by  the  use  of  two-dimensional  FFT. 
The  procedure  is  equivalent  to  the  one  used  by  Hunt  [2-40]  by 
making  the  coefficient  V equal  to  zero  in  Twomey's  method.  If  the 
blur  matrix  is  separable,  considerable  simplification  in  the  compu- 
tation of  the  pseudoinverse  can  be  achieved.  Using  (5.  1-6)  it  follows 


that  [5-2] 
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B+  = [(BC©BR|T  (Sc©br)]'1  (Bc©Br)T 
= 1®q  ©5r)  ®c  ©Sr)]"1  (b£©b£) 

= (b^Qb^Br)-1  (b!©bT, 


= t©C®C  > " ' © (Sr5r  )’ 1 ] (B  J © B 


T -1  T 

r5r»  j ®£ 


■®£C>',^©«&r»-,bI 


= ?c©Sr 


(5.  1-11) 


In  the  previous  derivation  the  following  identities  were  used  [5-3] 


(A©b)t  = at@b] 


(5.  1-12) 


(A©B)(F©G)  = (AF)  © (BG)  (5.1-13) 


(A©B)_1  = A"  1 © B_ “ 1 


(5.  1-14) 


Equation  (5.  1-11)  allows  the  computation  of  BT  through  the 
Kronecker  product  of  pseudoinverses  of  much  smaller  dimensions. 
Since  the  computational  methods  for  the  pseudoinverse  [3-3,  chapter 
V]  are  involved  and  particularly  sensitive  to  numerical  errors  due  to 
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round-off,  the  importance  of  (5.  1-11)  becomes  evident. 

The  method  used  for  the  computation  of  the  pseudoinverses 
T T 

— C and  2r  was  the  gradient  projection  by  Pyle  [3-3,  pages  69-74]. 

It  is  based  on  an  application  of  the  Gram-Schmidt  orthonormalization 
process.  In  the  course  of  the  method  some  care  must  be  exercised 
in  order  to  decide,  within  the  precision  of  the  machine,  whether  a 

vector  can  be  given  by  a linear  combination  of  the  previous  ones  or 
not. 

The  restored  image  has  been  computed  through  the  expression 

* = ®C  (5.1-15) 

for  several  values  of  blur,  under  noisy  or  noise-free  conditions, 

blurred  by  gaussian  or  sine2  functions.  In  the  case  of  no  noise,  the 

least  squares  problem  reduces  to  the  solution  of  the  system  of  linear 
equations 


^ ~ (5.  1-16) 

and,  under  the  assumption  of  full  column  rank  of  the  matrix  B (over- 
determined model),  the  solution  exists  and  It  is  unique,  given  by  the 

same  expression  as  the  estimator  for  the  white  noise  (equation 
(5.  1-10)). 


Figure  (5.  1-5)  shows  the  blurred  and 


restored  images  under 


Blurred  b = b - . 70  , Var  - 0 Blurred  b -b  - . 70  , Var  - 10 

V H v ri 

Gaussian  Blur  .Overdetermined  Gaussian  Blur  .Overdetermined 
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gaussian  blur  and  b = b = 0.70.  The  first  column  of  pictures  re- 

V H 

fers  to  the  no  noise  case.  It  can  be  seen  that  the  restored  image 
coincides  with  the  original  picture.  This  is  what  should  be  expected 
with  the  overdetermined  model:  since  the  solution  to  the  linear  sys- 
tem (5.  1-16)  is  unique,  the  original  picture  is  the  only  possible  solu- 
tion. In  the  second  column  of  pictures  white  gaussian  noise  has  been 
added  to  the  blurred  image.  Now  the  -estored  picture  is  n*.1  ■jqual  to 
the  original,  but  the  difference  is  relatively  small,  with  the  light 
square  in  the  middle  being  clearly  distinguishable. 

A remark  should  be  made  at  this  point:  even  though  all  the 
blurred  pictures  with  the  overdetermined  model  are  12  x 12,  only  the 
center  8x8  parts  are  displayed,  blown  up  to  256  x 256. 

Figure  (5.  1-6)  presents  the  same  results  as  Figure  (5.  1-5) 

for  different  values  of  the  blur  coefficients,  b = b = 2.  5.  The 

V H 

result  obtained  with  no  noise  shows  that  the  original  picture  is  still 
obtained,  although  a closer  inspection  of  the  numerical  values  will 
evidence  some  round-off  in  the  computation.  The  noisy  restoration 
differs  substantially  from  the  previous  result.  Even  though  the  noisy 
and  blurred  picture  is  visually  barely  different  from  the  noise  free 
case,  the  restored  image  differs  considerably,  with  the  bright  square 
in  the  middle  not  even  visible. 

Figure  (5.  1-7)  presents  the  results  for  the  case  b = b = 
5000.  For  the  noisy  restoration  it  is  observed  that  the  result  is 
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somewhat  in  between  the  two  previous  results  as  far  as  fidelity  of 
the  restored  picture  to  the  original  is  concerned. 

Figure  (5.  1-8)  shows  the  result  for  b^.  = b^  = 0.70  under 
more  severe  noise  conditions.  It  evidences  the  little  sensitivity  of 
the  solution  to  the  increase  in  noise  level  for  these  values  of  blur 
coefficients. 

These  experimental  results  suggest  that  the  blur  coefficient 
influences  considerably  the  amount  of  perturbation  on  the  solution 
with  respect  to  the  corrupting  noise.  The  concept  of  condition  num- 
ber developed  in  chapter  3 offers  an  adequate  framework  to  explain 
and  predict  the  behavior  of  the  solution  with  respect  to  the  perturba- 
tion represented  by  the  noise. 

Therefore,  the  condition  number  of  the  blur  matrix  B = 

— c®— R *°r  — c = — R Was  comPuted  as  a function  of  the  blur  coef- 
ficient. This  was  done  by  the  following  procedure.  Since  the  (M2 
2 

x N ) matrix  B is  given  by  the  Kronecker  product  of  two  (MxN)  ma- 
2 

trices  B^,  its  N singular  values  are  obtained  by  all  the  possible 
combinations  of  products  of  the  N singular  values  of  B [5-4]  . The 

c 

condition  number  c(B)  is  the  ratio  of  the  largest  to  the  smallest  sin- 
gular values  of  B,  which  is  equal  to  the  square  of  the  ratio  of  the 
largest  to  the  smallest  singular  value  of  Bc,  this  being  the  condition 
number  of  Bc . Now,  the  square  of  the  condition  number  of  B is  the 

__  c 

condition  number  of  (b£bc)  [3-8,  page  223]  which  can  be  calculated 
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by  multiplying  the  norm  of  (BCBC)  by  the  norm  of  (BCBC)  . This 
has  been  done  by  using  the  Froebenius  norm,  for  several  values  of 
the  blur  coefficient.  Figure  (5.  1-9)  shows  the  result  of  this  experi- 
ment. The  condition  number  is  maximum  for  moderate  values  of 
the  blur  coefficient.  The  curve  explains  the  results  obtained  with 
the  restoration  of  the  noisy  images.  In  fact,  b = b = 0.70  is  in  the 
initial  part,  with  low  values  of  c(B),  and  thus  implies  little  effect  of 
the  noise  on  the  restored  image,  b = b = 2.5,  on  the  other  hand, 
is  on  the  peak  of  the  curve  with  maximum  perturbation  and,  finally, 

b = b = 5000  gives  moderate  values  for  the  condition  number  and 
V H 

effect  of  the  noise.  The  same  type  of  curve  for  the  condition  number 

2 

was  observed  for  the  sine  spread  function.  Figure  (5.  1-10)  dis- 

T 

plays  the  results.  The  matrix  (BCBC)  becomes  nearly  singular  for 
moderate  values  of  the  blur  coefficient.  An  even  greater  variation 
of  the  condition  number  was  observed  as  compared  to  the  gaussian 
spread  function. 

2 

The  restoration  experiments  were  repeated  with  the  sine 
blur.  Figure  (5.1-11)  shows  the  results  for  b = b = 0.25.  The 
noise-free  restoration  reproduces  the  original  picture  while  the 
noisy  image  is  restored  to  values  that  are  very  close  to  the  original 
ones.  This  corresponds  to  the  very  small  condition  number  on  this 
part  of  the  curve.  Figure  (5.  1-12)  presents  the  experimental  re- 
sults for  by  = b^j  = 1.0.  In  the  noise  free  case,  even  though  the 
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Figure  (5.  1-11)  Examples  of  Restoration  with 


the  Overdetermined  Model  and  Sine  Blur -I 


A 


-w^ap*-'***  r*1  * 


Blurred  b = b =1.0,  Var  = 0 
2 V H 

Sine  Blur,  Overde  ter  mined 


Restored 


Blurred  b =b  =1.0  , Var  =40 
2 V H 

Sine  Blur,Overdetermined 


Restored 


Figure  (5.1-12)  Examples  of  Restoration  with 


the  Overdetermined  Model  and  Sine  Blur  -II 


156 


restored  image  visually  reproduces  the  original,  the  effect  of  the 
high  condition  number  can  be  observed  in  the  actual  numerical  re- 
sults presented  in  Figure  (5.  1-13).  The  perturbation  due  to  round- 
off is  noticeable  (single  precision  has  been  used  in  the  calculations). 
In  the  noisy  case,  the  displayed  restored  image  shows  the  enormous 
effect  of  the  perturbation,  but  in  order  to  give  an  idea  of  the  order  of 
magnitude  of  the  error  in  the  estimation,  the  non-clipped  numerical 
results  are  presented  in  Figure  (5.  1-14).  The  very  large  pertur- 
bations imposed  by  a high  condition  number  are  clearly  displayed. 

The  restoration  under  a moderate  value  of  condition  number,  under 
2 

sine  biur,  is  displayed  in  Figure  (5.  1-15). 

A complementary  point  of  view  to  explain  the  effect  of  the 
noise  on  the  restoration  can  be  given.  Equation  (3.  1-41  gives  the 
expression  of  the  covariance  matrix  of  the  estimator.  Under  white 
noise  conditions,  that  expression  reduces  to 

Va  = cj2(BTB)_1  (5.1-17) 

In  the  condition  number  is  high  the  matrix  (B  BJ  is  nearly  singular 
and  large  variances  of  the  estimated  values  are  expected.  On  the 
other  hand,  if  the  condition  number  is  low,  the  opposite  situation 
occurs  and  the  variances  of  the  estimated  values  are  reduced. 

Figure  (5.  1-16)  and  (5.  1-17)  display  the  condition  number 
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Figure  (5.  1-15)  Examples  of  Restoration  with 


k 


the  Overdetermined  Model  and  Sine  Blur-III 
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GAUSSIAN  BLUR 
N =8 


Figure  (5.  1-16)  Blur  Coefficient* 
Condition  Number  Curves  for  Gaussian 
and  Different  Number  of  Sampled  Values 


Figure  (5.  1-17)  Blur  Coefficient, 
Condition  Number  Curves  for  Sinc^  Blur 
and  Different  Number  of  Sampled  Values 
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curves,  for  a given  number  of  quadrature  nodes  (N  - 8),  varying  the 
number  of  sampled  values  (M),  while  maintaining  the  structure  of  the 
blur  matrix  given  in  Figure  (5.  1-21.  These  results  give  more  in- 
sight into  the  reason  for  the  existence  of  a maximum  of  the  condition 
number  curves.  This  is  due  to  the  truncation  of  the  point  spread 
function.  In  fact,  for  increasing  M,  the  number  of  points  where  this 
function  can  be  nonzero  is  increased  and  the  effect  of  the  truncation 
starts  only  for  higher  blur  coefficients.  Consequently,  the  curves 
for  different  values  of  M have  essentially  a common  ascending  branch 
and  the  descending  part  starts  at  varying  points  for  different  values 
of  blur  coefficients.  If  there  were  no  truncation,  the  curve  would 
approach  infinity  very  fast,  the  asymptotic  value  being  obtained  for 
the  smoothest  possible  kernel,  with  constant  value  one,  implying  a 
blur  matrix  with  rank  one.  With  the  truncation,  the  curves  show  a 
descending  branch  that  begins  at  the  point  where  the  increasingly 
wider  kernel  starts  to  be  cut  down  substantially.  Now,  for  increas- 
ing value  of  blur  coefficient,  the  curves  tend  to  a finite  value. 

These  curves  can  be  used  as  a guide  for  the  choice  of  the 
number  of  sampling  points,  once  th^  number  of  quadrature  nodes  is 
fixed.  For  a very  small  amount  of  blur  all  curves  coincide  so  that 
the  designer  may  choose  M = N with  almost  no  error.  Blur  in  this 
case  pla/s  no  role,  only  noise  will  affect  the  restoration.  With 
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increasing  amount  of  blur,  different  numbers  of  sampling  points  will 
give  different  values  of  condition  number.  If  a curve  on  an  asce-.ding 
branch,  is  chosen,  truncation  will  have  no  effect  on  the  kernel  but  a 
high  condition  number  will  impose  high  variances  on  the  estimators. 
If  a curve  on  a descending  branch  is  selected,  lower  variances  of  the 
estimators  will  be  obtained,  at  the  price  of  error  on  the  estimation 
of  the  continuous  function  due  to  the  truncation  error  in  the  discrete 
model.  Therefore,  a trade -off  between  the  variance  of  the  estima- 
tors and  the  truncation  error  of  the  discrete  model  can  be  charac- 
terized. 

Although  these  conclusions  are  drawn  based  on  the  particular 
model  discussed  in  this  section,  they  are  more  general.  This  comes 
from  the  fact  that  the  inverse  of  the  integral  operator  that  describes 
the  blur  is  unbounded.  Therefore,  the  closer  the  discrete  model  fol- 
lows the  continuous  one,  the  more  ill  conditioned  the  former  model 
tends  to  be.  A move  in  the  opposite  direction  reduces  singularity 
but  imposes  modeling  errors.  This  inevitable  dilemma  can  only  be 
broken  with  the  intervention  of  correct  a priori  knowledge  about  the 
solution. 

The  effect  of  changes  in  .he  blur  matrix  was  also  experimen- 
tally confirmed.  For  this  purpose  an  image  was  restored  using  a 
value  of  blur  coefficient  different  from  the  one  that  was  used  for  its 
blurring.  Figure  (5.  1-18)  shows  the*  result  for  noise-free  and 
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Figure  {5.  1-19)  for  noisy  observations.  The  perturbation  is  mi  h 
higher  in  the  noisy  case,  with  many  reversals  of  signs  in  the  solu- 
tion. This  in  accordance  with  the  conclusions  of  equation  (3.3-34), 
which  predicted  that  in  the  noise-free  case,  the  condition  number 
would  matter,  while  in  the  presence  of  noise  the  square  of  this 
quantity  would  determine  the  amount  of  perturbation. 

5.2  Restoration  for  the  Underdetermined  Model 

Another  set  of  experiments  has  been  performed  for  the  under- 
determined model,  i.e.  , when  the  number  of  quadrature  nodes  ex- 
ceeds the  number  of  observed  values.  The  following  real  situation 
is  simulated  by  there  experiments:  the  image  of  part  of  an  object  is 
taken  (e.g.,  the  photograph  of  a certain  region  by  an  earth  resources 
satellite);  as  in  the  overdetermined  model,  a decision  is  made  to 
place  the  nodes  of  the  quadrature  integration  at  equally  spaced  points 
on  a rectangular  grid.  The  sampling  points  are  separated  by  the 
same  distance  as  the  quadrature  nodes  and  they  coincide  with  some 
of  the  quadrature  nodes.  The  number  of  sampling  points  in  this  case 
will  be  determined  by  the  size  of  the  image.  The  point  spread  func- 
tion is  assumed  to  be  truncated  to  twice  the  sampling  distance,  like 
in  the  overdett rmined  model.  This  determines  the  extent  of  the 
original  picture  that  contributes  to  the  blurred  picture  and  only  the 
quadrature  nodes  that  make  a nonzero  contribution  with  this  trun- 
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cated  kernel  are  retained. 

Figure  (5.2-1)  describes  the  data  arrays.  The  vector  rep- 
resentation for  the  original  and  blurred  arrays  is  still  valid,  but  the 
matrix  B_  is  the  transpose  of  the  matrix  of  the  overdetermined  model. 
Figure  (5.2-2)  shows  the  partition  of  15  in  this  case.  The  structure 

of  the  submatrices  B . is  described  in  Figure  (5.2-3).  In  the  case 

J 

of  separable,  space  invariant  blur,  the  unidimensional  blur  matrix 
has  the  form  expressed  by  Figure  (5.2-4). 

The  experiments  have  been  performed  with  the  gaussian 
shaped  blur  and  white  gaussian  noise.  The  original  picture  is  com- 
posed of  (12  x 12)  pixels,  coinciding  with  the  (8  x 8)  picture  depicted 

in  Figure  (5.  i)  on  the  center  part. 

As  pointed  out  in  the  previous  chapter,  there  is  no  unbiased 
estimator  in  this  case  and  the  solution  of  the  least  squares  problem 
is  not  unique.  The  minimum  norm  solution  is  given  by 

g = B+;y  (5.2-D 

Since  the  blur  matrix  is  the  transpose  of  the  corresponding 
matrix  for  the  overdetermined  model,  it  follows  that  13  has  full  row 
rank  and  B^  can  be  given  by 


np  rp 

B+  = B (BB  ) 


(5.2-2) 


IMPULSE  RESPONSE  ARRAY 
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Figure  (5.2-1)  The  Data  Arrays 
in  the  Underdetermined  Model 
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The  assumption  of  shift  invariance  allows  the  computation  of 
(5.  2-1)  to  be  done  very  efficiently  using  Fast  Fourier  Transform 
algorithms.  In  the  separable  case,  using  a derivation  entirely  an- 
alogous to  the  one  that  led  to  equation  (5.  1-11),  it  is  possible  to 
conclude  that 


B+  = Bc+  © Br+  (5.2-3) 

The  noise-free  minimum  norm  solution  is  not  necessarilv  the  origi- 
nal picture  and  this  is  clearly  shown  in  Figures  (5.2-5)  and  (5.2-6), 
for  blur  coefficients  set  at  . 5,  5 and  500.  Note  that  only  the  center 
(8  x 8)  part  of  the  restored  (12  x 12)  picture  is  shown,  blown  up  to 
(256  x 256). 

The  noisy  restorations,  displayed  in  Figure  (5.2-7)  and 
(5.2-8)  show  the  same  pattern  of  the  overdetermined  model,  namely, 
small  perturbation  in  the  solution  due  to  noise  for  small  blur,  fol- 
lowed by  large  and  moderate  perturbations  for  increasing  values  of 
the  blur  coefficient.  This  fact  cannot  be  explained  by  the  condition 
number  since  it  is  infinite  in  this  case.  However,  since  13  is  the 
transpose  of  the  matrix  in  the  overdetermined  case,  and  considering 
the  fact  that  the  nonzero  eigenvalues  of  B^  B and  BB  are  the  same 

[3.  5,  page  41]  it  turns  out  that  the  ratio  of  the  largest  to  the 

T 

smallest  nonzero  eigenvalue  of  B B follows  the  curve  given  by 
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Figure  (5.2-5)  Restoration  for  the 
Underdetermined  Model  - Noise  Free  Case  I 


Blurred  by  = b =500  , Var  = 0 
Gaussian  Blur,  Underdetermined 


Restored 


Figure  (5.2-6)  Restoration  for  the 
Underdetermined  Model  - Noise  Free  Case  II 
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Figure  (5.  1-9,  Therefore,  the  ratio  of  the  largest  to  .he  smallest 
finite  principal  axes  of  the  degenerate  K -ellipsoid  follows  the  same 
curve. 

On  the  other  hand,  the  minimum  norm  solution  is  obtained  by 

projecting  the  origin  of  the  x-space  orthogonally  onto  the  subspace 

which  consists  of  null  space  of  B,  N(B),  added  to  B^_y.  Therefore, 

when  the  minimum  norm  solution  (B^y)  is  taken  from  this  subspace, 

no  variation  in  this  solution  due  to  noise  is  allowed  in  the  direction  of 

the  eigenvectors  that  span  N(B).  These  are  precisely  the  eigenvec- 

T 

tors  corresponding  to  the  zero  eigenvalues  of  B B.  Only  variations 

along  the  eigenvectors  corresponding  to  nonzero  eigenvalues  are 

allowed.  These  variations  are  in  the  (nondegenerate)  ellipsoid  that 

consists  of  the  intersection  of  the  original  (degenerate)  ellipsoid  and 

the  hyperplance  that  passes  through  the  origin  and  is  orthogonal  to 
. T T 

N'B),  that  is,  the  range  of  B , 1<(B  ).  The  shape  of  this  ellipsoid 
is  the  same  as  tho  shape  of  the  ellipsoid  of  the  dual,  overdetermined 
model  because  the  eigenvalues  are  identical.  Therefore,  the  varia- 
tions of  the  solution  of  the  underdetermined  model  In  this  subspace 
of  restricted  dimensions  should  be  of  the  same  type  as  in  the  corres- 
ponding overdetermined  model.  Viewed  from  another  perspective, 
this  situation  can  be  described  as  follows:  by  projecting  the  origin 
onto  N(B)  added  to  to  obtain  the  minimum  norm  solution,  B^jr , 


the  infinite  variances  of  t’  ' unde  rde  ter  mined  model  are  avoided, 
leaving  only  the  finite  variances  in  the  directions  of  the  nondegen- 
erate axes  of  the  ellipsoid.  This  is  done  at  the  price  of  imposing 
bias,  since  the  lack  of  information  in  the  sample  is  compensated, 
not  by  a correct  a priori  information  about  the  original  picture,  but 
by  merely  imposing  a minimum  norm  solution.  This  trade-off 
between  bias  and  variance  is  somehow  analogous  to  the  one  between 
modeling  error  and  high  condition  number  in  the  choice  of  the  size  of 
the  point  spread  function.  A similar  ‘nation  will  also  occur  with 
the  use  of  linear  equality  and  inequality  constraints  in  the  restoration. 


5.3  The  Computation  of  Confidence  Intervals  and  Hypothesis 

Testing  in  the  Over  determined  Model 

Some  computational  work  has  been  performed  with  the  ob- 
jective of  determining  both  confidence  intervals  and  results  of  hy- 
pothesis testing  in  the  linear  model  for  restoration.  In  order  to 
simplify  the  calculations,  the  unidimensional  regression  model  has 
been  employed. 

Figures  (5.3-1)  and  (5.3-2)  present  the  results  of  the  compu- 
tation of  the  68%  confidence  interval  for  individual  pixel  values,  under 


gaus8ian  and  sine  blur,  respectively.  The  correlation  of  these 


curves  with  those  of  the  condition  number  is  clear.  The  higher  this 
quantity  the  greater  the  confidence  interval  for  the  same  pixel  value , 


jure  (5.  3-1)  68%  Confidence  Intervals  for  the  Unidimensional, 
Over  determined  Model,  Gaussian  Blur,  Unit  Variance 
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Figure  (5.3-Z)  68%  Confidence  Intervals 
for  the  Unidimensional,  Overdetermined  Model,  Sine2  Blur, 
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due  to  the  larger  variances  involved.  Furthermore,  for  large  values 

2 

of  blur  coefficient,  the  curves  for  gaussian  and  sine  blur  tend  to 
coincide.  This  is  due  to  the  fact  that  the  truncated  spread  functions 
assume  the  constant  value  unity  when  the  blur  coefficient  tends  to 
infinity. 

The  hypothesis  testing  experiments  involve  one  pixel  value  in 

2 

the  unidimensional  model,  with  sine  blur  and  white  gaussian  noise. 
Two  distinct  sets  of  tests  have  been  performed,  the  first  with  known 
variance,  using  the  normal  distribution,  and  the  second  under  un- 
known variance,  making  use  of  the  Student's  distribution.  The 
reader  is  referred  to  Appendix  A for  the  theoretical  material  con- 
cerning hypothesis  testing. 

In  both  tests  the  level  of  confidence  is  set  at  10%;  the  tests  are 
two-sided,  testing  the  fourth  pixel  value,  with  the  null  hypothesis  Hq 
bc*ng  x4  = A against  the  alternative  hypothesis  x4  i A,  for  different 
values  of  A;  the  true  value  of  x4  is  set  at  245,  the  variance  of  the 
gaussian  noise  is  50  in  both  tests.  Tables  5.  3-1  and  5.3-2  present  the 
results  for  the  normal  and  Student's  distribution  tests,  respectively. 

Again,  the  correlation  of  the  testing  results  with  the  condition 
number  is  evident  through  the  inspection  of  the  tables.  A higher  con- 
dition number  is  associated  with  a higher  variance  of  the  statistics 
used  in  ‘;he  test.  For  a given  size  of  the  test  (or  probability  of  false 
alarm),  fixed  by  the  Neyman  Pearson  criterion,  the  power  (or 
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Table  5.3-1  Hypothesis  Testing 
For  Pixel  Values,  Normal  Distribution 


a)  Variance  known  (normal  distribution) 
Level  of  confidence:  10%  (two-sided) 
True  value:  245 


Variance:  50 


Blur:  sine  (diffraction  limited) 


H : x = A 
o 4 


H.:  x i A 
l 4 


Blur 

Coefficient 

Condition 

Number 

' 245 

145 

45 

0.25 

1.0 

-0. 156 
(accept) 

12.  62 
(reject) 

12.40 

(reject) 

1.  0 

2500 

0.999 

(accept) 

l.  20 
(accept) 

1.40 

(accept) 

500 

13.0 

-1.42 

(accept) 

1.40 

(accept) 

4.23 

(reject) 

■k.  . a a . , » - 
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Table  5.3-2  Hypothesis  Testing 
For  Pixel  Values,  Student's  Distribution 


b)  Variance  unknown  (Student's  distribution) 

Level  of  confidence  : 10%  (two-sided) 

True  value:  245 

Variance:  50 
2 

Blur:  sine  (diffraction  limited) 

Degrees  of  freedom:  12-8  = 4 


Blur 

Coefficient 

H : x = A 
o 4 

Condition 

Number 

Hr  x4  ! 

i A 

A 

245 

145 

45 

0.25 

1.0 

-0. 140 

1 1. 34 

22.83 

(accept) 

(reject) 

(accept) 

1.0 

2500 

0.751 

0.902 

1.05 

(accept) 

(accept) 

(accept) 

500 

13.0 

-1.49 

1.46 

4.41 

(accept) 

(accept) 

(reject) 

Decision  Regions 


Reject  H 


Accept  H 


Reject  H 


-2. 132 


+2. 132 


probability  of  detection)  decreases  with  the  variance.  This  explains 
the  smaller  probability  of  rejecting  the  null  hypothesis  when  it  is 
false,  obtained  for  higher  condition  numbers. 


6.  EXPERIMENTAL  RESULTS  WITH  LINEAR 
CONSTRAINED  RESTORATION 

In  this  chapter  the  results  obtained  with  linear  constrained 
restoration  will  be  discussed.  The  first  section  presents  experiments 
with  linear  equality  constraints  while  the  second  section  focuses  on 
linear  inequality  restrictions. 

6.  1 Equality  Constraints 

As  a first  attempt  to  overcome  the  instabilities  found  in  the 
use  of  unconstrained  restoration  with  regression  techniques,  a single 
equality  constraint  has  been  imposed  on  the  overdetermined  model. 
The  constraint  consists  of  restricting  the  sum  of  the  restored  values 
to  be  equal  to  the  sum  of  the  original  pixels.  Since  the  analyst  would 
not  have  direct  access  to  this  value  in  a real  world  experiment,  the 
sum  has  been  varied.  The  application  of  the  Toro  Vizcarrondo- 
Wallace  test  showed  that  there  was  an  improvement  in  the  mean 
square  error  for  considerable  variation  of  the  constrained  value, 
under  a given  confidence  level.  However,  the  variation  of  the  numeri 
cal  answer  was  minimal.  This  can  be  explained  in  view  of  the  fact 
that  for  small  condition  number  the  unconstrained  solution  nearly 
satisfies  the  constraint  and  with  moderate  or  large  condition  number 
the  instabilities  are  in  the  form  of  oscillations  from  pixel  to  pixel. 


185 


186 


The  imposition  of  a given  value  for  the  sum  of  the  restored  pixels 
simply  cha.iges  the  D.C.  component  of  the  waveform  without  affecting 
the  oscillations.  Adopting  the  point  of  view  of  ellipsoids,  this  means 
that  this  elimination  of  one  dimension  in  a 8 x 8 = 64 -dimen sionat 
ellipsoid  does  not  seem  to  be  done  in  the  direction  of  the  eigenvectors 
corresponding  to  the  smallest  singular  values. 

In  order  to  obtain  a reasonable  decrease  in  the  variance,  a 
higher  dimensional  and  more  appropriate  restriction  should  be  neces- 
sary. Considering  both  the  nature  of  the  image  and  the  characteristic 
of  the  oscillations,  it  was  felt  that  the  restriction  that  pairs  of  ad- 
jacent pixels  should  be  equal  would  tend  to  damp  out  the  oscillations. 
This  is  a 32-dimensional  restriction  in  a 64 -dimensional  space.  In 
the  particular  case  of  the  image  used  in  these  experiments  this  re- 
striction is  satisfied  by  the  original  image.  In  other  cases,  the  im- 
position of  these  constraints  will  represent  a smoothing  of  the  solution 
in  relation  to  the  original.  Observe  that  the  rows  of  the  matrix  A of 
equation  (4.2-1)  have  in  this  case  the  property  of  being  shifted  ver- 
sions of  the  first  row  which  opens  the  possibility  of  computation  of 

(4.2-2)  by  Fourier  Transform  methods  in  the  case  of  space  invariant 
blur  and  white  noise . 

Equation  (4.2-2)  has  been  solved  using  the  simplification  that 
comes  from  the  assumption  of  separable  blur  functions  and  white 
noise.  This  implies  that  the  matrix  (HTH)  can  be  inverted  by  taking 
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the  Kronecker  product  of  smaller  matrices. 

Figures  (6.  1-1),  (6.  1-2),  and  (6.  1-3)  compare  the  results  of 
unconstrained  and  linear  equality  constrained  restorations  in  the 
overdetermined  model  for  gaussian  shaped  blur,  under  the  situations 
of  small,  large  and  moderate  condition  number.  With  small  con- 
dition number  the  constrained  restoration  differs  very  little  from  the 
unconstrained  case  and  both  are  very  close  to  the  true  value.  It  is 
with  larger  values  of  condition  number  that  the  effect  of  the  con- 
straints, blocking  the  oscillatory  nature  of  the  unconstrained  esti- 
mator, can  be  observed. 

Similar  results  have  been  obtained  with  diffraction  limited  blur, 
shown  in  Figures  (6.  1-4),  (6.  1-5),  and  (6.  1-6).  In  this  case  the 
problem  can  become  extremely  ill  conditioned,  and  for  blur  coef- 
ficients equal  to  1.0  the  round-off  error  in  evaluating  the  inverse  of 
the  matrix  i~a  (H  H)  A]  prevents  a meaningful  result  to  be  ob- 
tained, so  that  the  high  condition  number  situation  is  exemplified  by 

the  somehow  better  conditioned  case  of  b = b =13 

y H • J * 

In  both  types  of  blur  the  statistics  for  the  F-test  have  been  com- 
puted and,  under  any  reasonable  confidence  level,  the  hypothesis 
specifying  that  the  linear  relationships  are  true  is  accepted,  con- 
firming their  validity. 

'Ihe  results  obtained  with  both  types  of  blur  seem  to  be  indica- 
tive of  the  degree  of  damping  of  the  oscillations  that  can  be  achieved 
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Figure  (6.  1-1)  Comparison  of  Unconstrained 
and  Equality  Constrained  Restorations 
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Figure  (6.  1-2)  Comparison  of  Unconstrained 
and  Equality  Constrained  Restorations 
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Figure  (6.  1-4)  Comparison  of  Unconstrained 
and  Equality  Constrained  Restorations 
Sine2  Blur,  by  = b^  = . 25 


Figure  (6.  1-5)  Comparison  of  Unconstrained 
and  Equality  Constrained  Restorations 

Sinc^  Blur,  by  = b^  = 1.3 


194 


by  the  use  of  linear  equality  constraints  involving  only  a reasonable  a 
priori  knowledge  of  the  smoothness  of  the  solution.  There  is  a trade- 
off  between  the  elimination  of  the  oscillations  and  the  achieved  reso- 
lution of  me  picture. 

The  use  of  the  linear  equality  constrained  method  leads  to  the 
determination  of  confidence  intervals  for  parametric  functions  of 
pixel  values.  The  unidimensional,  overdetermined  regression  model 
has  been  set  up,  with  gaussian  shaped  blur.  The  dimensions  of  the 
vector  of  observations  and  the  vector  of  original  pixel  values  are 
(12  x 1)  and  (8  x 1)  respectively.  The  fourth  pixel  value  has  been  con- 
strained to  vary  from  0 to  500  and  the  norm  of  the  residual  vector, 
U^-HxlJ  is  computed.  Figure  (6.  1-7)  shows  the  result  for  different 
values  of  the  variance.  These  curves  give  examples  of  the  type  of 
result  represented  in  Figure  (4.3-3)  for  the  case  when  no  inequality 
constraint  is  imposed  upon  the  solution. 

Observe  that  the  minimum  value  of  each  curve  is  obtained  for 
the  pixel  value  x4  that  corresponds  to  the  unconstrained  solution.  It 
is  only  the  true  value  (245)  for  the  no  noise  case.  In  this  case  the  K- 
ellipsoids  are  degenerate  so  that  the  parabola  also  reduces  to  straight 
line  s . 

The  confidence  interval  foi  the  95%  confidence  level  has  been 
computed  using  the  K value  given  by  the  table  of  normal  distribution, 
as  described  in  chapter  3.  It  should  be  remarked  that  the  parabola 
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for  Var  = 500  is  more  open  than  for  Var  = 30,  indicating  that  larger 
confidence  intervals  are  obtained  in  the  former  case. 

6.2  Inequality  Constraints 

The  experiments  with  linear  inequality  constrained  restoration 
involve  the  solution  of  the  quadratic  programming  problems  discussed 
in  chapter  4.  The  problems  have  been  solved  through  the  use  of  the 
Dantzig's  algorithm  [6-1].  The  lower  and  upper  bounds  were  set  at 

0 and  255,  respectively.  An  attempt  to  solve  the  two-dimensional 

problem  directly  has  to  face  the  serious  storage  requirements  of  the 
T 

matrix  (H  H)  of  the  quadratic  expression,  involving  approximately 

1 ^4 

J x N elements,  where  N is  the  size  of  one  dimension  in  the  original 
image.  Moreover,  the  attempt  revealed  numerical  convergence  prob- 
lems due  to  the  large  amount  of  computations  involved,  ev  n though 
double  precision  was  used.  Therefore,  the  two-dimensional  problem 
was  solved  by  a sequence  of  solutions  involving  first  restorations  on 
the  rows  and  then  restorations  on  the  columns.  This  is  clearly  an  ap- 
proximate method.  The  approximation  tends  to  be  better  when  there  is 
little  blur.  This  occurs  because  on  one  hand  a moderate  amount  of 
blur  implies  large  condition  number  and  the  input  to  the  column  resto- 
ration will  be  clipped.  This  tends  to  invalidate  the  model  of  additive 
noise  that  is  the  basis  of  the  quadratic  programming  algorithm.  On 
the  other  hand,  for  large  blur,  with  the  truncation  of  the  point  spread 


■ 


r 


197 

function,  the  condition  number  is  moderate,  but  now  the  random  vari- 
ables in  the  same  row  will  tend  to  be  correlated  and  this  is  not  taken 
into  account  in  the  column  restoration. 

The  linear  inequality  constrained  restorations  have  been  per- 

2 

formed  under  gaussian  or  sine  blur,  for  different  values  of  condition 
number.  Figures  (6.  2- 1),  (6.2-2),  and  (6.  2-3)  illustrate  the  results 
for  gaussian  shaped  blur,  while  Figures  (6.2-4),  (6.2-5),  and  (6.2-6) 
refer  to  the  diffraction  limited  case.  The  improvement  over  the  un- 
constrained restoration  is  clear,  particularly  with  high  condition 
number.  Figure  (6.2-5)  illustrates  an  example  of  a completely  un- 
feasible restoration  using  straightforward  regression  techniques  be- 
coming feasible  by  the  addition  of  inequality  constraints.  The  im- 
provement by  the  use  of  this  type  of  restriction  is  greater  than  with 
the  equality  constraints,  although  a much  higher  computational  task 
has  to  be  performed.  The  solution  of  each  eight  variable  quadratic 
programming  problem  took  between  6 and  7 seconds,  completing  10 
or  12  iterations  of  the  algorithm.  However,  most  of  this  time  was 
spent  writing  and  reading  from  the  disk  where  the  data  is  stored.  It 
is  felt  that  a substantial  reduction  in  time  is  possible  by  all-in-core 
programming.  It  should  be  noted  that  the  upper  and  lower  bounds  of 
the  quadratic  programming,  respectively  255  and  0,  determine  exactly 
the  range  of  values  that  a-e  displayed.  Therefore,  while  the  uncon- 
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Figure  (6.  2-2)  Comparison  of  Unconstrained 
and  Inequality  Constrained  Restorations 
Gaussian  Blur,  b^.  = =2-5 
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Figure  (6.2-4)  Comparison  of  Unconstrained 
and  Inequality  Constrained  Restorations 

Sinc^  Blur,  b...  = b = .25 
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Figure  (6.2-5)  Comparison  of  Unconstrained 
and  Inequality  Constrained  Restorations 

Sinc^  Blur,  b^.  = b^  =1.0 
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strained  or  equality  constrained  results  are  clipped  to  these  bounds, 
sometimes  from  much  higher  or  lower  values,  the  displayed  inequality 
constrained  restored  pictures  reflect  precisely  the  numerical  results. 

The  determination  of  approximate  confidence  intervals  for 
parametric  functions  of  pixel  values  is  also  done.  Figure  (6.2-7) 
illustrates  the  comparison  with  the  unconstrained  restoration.  Ob- 
serve that  both  confidence  intervals,  unconstrained  and  constrained 
are  pessimistic.  In  fact,  these  intervals  were  computed  using  the  Y 
value  which  is  approximately  1.89  times  greater  than  the  K-value  used 
in  Figure  (6.  1-7)  for  the  unconstrained  case.  This  makes  the  uncon- 
strained confidence  interval  larger  than  in  that  figure,  in  order  to  be 
compared  with  the  constrained  interval.  The  last  interval  is  also 
pessimistic  for  the  reasons  stated  in  chapter  4. 


Figure  (6.2-7)  Comparison  Between  Unconstrained 
and  Inequality  Constrained  Approximate  Confidence  Intervals 


7.  CONCLUSIONS  AND  SUGGESTIONS 
FOR  FUTURE  RESEARCH 


In  this  dissertation  an  attempt  has  been  made  to  put  the  prob- 
lem of  image  restoration  on  a firm  theoretical  basis.  The  linear  in- 
tegral equation  that  describes  the  imaging  formation  process  has  been 
discretized  in  order  to  adapt  the  method  to  the  requirements  of  digital 
computation.  This  leads  to  a regression  model  whose  numerical  and 
statistical  properties  have  been  extensively  studied  in  this  work. 
Through  the  use.  of  this  regression  model  it  has  been  possible  to  in- 
corporate the  large  body  of  knowledge  developed  in  mathematical 
statistics,  econometrics,  optimization  theory  and  numerical  analysis 

into  the  field  of  image  restoration. 

The  first  developed  method  consists  of  the  solution  of  the  least 
squares  problem  by  the  set  of  normal  equations.  This  solution 
assumes  the  minimum  possible  amount  of  a priori  information  about 
the  image  to  be  restored.  In  the  regression  model  the  vector  of  pixel 
values  representing  the  image  is  simply  a set  of  parameters  to  be 
determined.  A price  is  paid,  of  course,  for  this  lack  of  knowledge. 

In  the  case  of  the  overdetermined  model  the  restoration  problem  can 
become  extremely  ill  conditioned  and  the  solution  may  exhibit  wildly 
oscillating  behavior.  The  study  of  the  variation  of  the  condition 
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number  as  a function  of  the  blur  coefficient  has  been  made,  for  two 
types  of  kernel,  namely,  atmospheric  turbulence  and  diffraction  limi- 
tation. The  effect  of  truncation  of  the  kernel  has  been  shown  to  be 
one  of  reduction  of  condition  number.  A trade-off  is  developed  be- 
tween the  uncertainty  in  the  estimators,  when  the  discrete  model  ap- 
proximates the  continuous  one,  and  the  error  in  modeling  when  the 
opposite  approach  is  taken.  The  use  of  the  underdetermined  model, 
representing  an  attempt  by  the  experimenter  to  estimate  more  points 
than  observed  values  implies  infinite  variances  of  the  estimator. 

When  a pseudoinverse  solution  is  obtained  this  unbounded  uncertainty 
is  traded  for  an  unknown  bias  in  the  estimators. 

It  becomes  clear  that,  in  order  to  obtain  meaningful  solutions, 
some  sort  of  a priori  knowledge  has  to  be  introduced.  Most  of  the 
prior  work  in  image  restoration  has  employed  probabilistic  a priori 
information,  as  a Bayesian  type  of  approach  to  the  problem.  This 
dissertation  has  explored  deterministic  a priori  constraints,  of  two 
types:  linear  equality  and  linear  inequality  restrictions.  A com- 
parison with  existing  methods  of  solution  has  been  made.  In  particu- 
lar, it  has  been  shown  that  the  smoothing  and  regularization  techniques 
for  solving  the  discrete  version  of  the  Fredholm  integral  equation  of 
the  first  kind  can  be  interpreted  as  least  squares  with  quadratic 
equality  constraints. 

The  use  of  linear  equality  constraints  has  shown  the  advantage 
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of  being  able  to  formulate  the  restrictions  in  terms  of  a hypothesis 
that  could  be  tested,  revealing  to  the  experimenter  the  validity  or  not 
of  these  restrictions.  The  test  could  be  performed  by  verifying  the 
restrictions  themselves  (F-test)  or  testing  whether  there  is  an  im- 
provement in  mean  square  error  by  the  use  of  these  (perhaps  incor- 
rect) constraints  (Toro  Vizcar rondo-Wallace  test).  A moderate 
visual  improvement  has  been  observed  by  the  use  of  linear  equality 
constraints  imposing  that  pairs  of  adjacent  pixels  should  be  equal. 
Nevertheless,  for  high  condition  number,  large  oscillations  are  still 
present  in  the  solution. 

When  inequality  constraints  are  applied  by  solving  the  quad- 
ratic programming  problem,  better  results  are  obtained,  damping 
the  oscillations  even  in  the  high  condition  number  situation.  This  is 
done  at  the  price  of  increased  computational  requirements,  in  terms 
of  both  storage  and  time.  It  is  felt  that  restoration  of  large  images  by 
quadratic  programming  could  only  be  done  in  blocks,  using  an  all -in- 
core  algorithm.  Its  use  could  be  justified  when  large  condition  num- 
bers are  involved  in  the  restoration  of  valuable  imagery. 

In  the  analysis  of  the  effect  of  perturbations  on  the  solution  of 
the  least  squares  problem  it  has  been  pointed  out  that  the  condition 
number  for  changes  in  the  blur  matrix  tends  to  increase  as  the  square 
power  when  noise  is  present.  As  a consequence,  extreme  care  must 
be  taken  in  the  measurement  of  the  blur  function,  particularly  when  a 
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high  condition  number  is  involved. 

The  research  pursued  in  this  dissertation  may  be  extended  in 
several  directions.  A more  detailed  study  of  the  interplay  of  the  dif- 
ferent sources  of  error  in  the  estimation  of  the  contiguous  image, 
when  a discrete  model  is  used,  would  be  of  considerable  interest. 

The  choice  of  the  number  and  location  of  nodes  of  quadrature  and 
sampling  points  affects  the  error  of  numerical  quadrature,  the  model- 
ing eiror  due  to  the  truncation  of  the  point  spread  function,  the  vari- 
ance of  the  estimators  and  the  aliasing  error.  The  problem  is  of 
considerable  complexity  and  one  should  be  probably  satisfied  with  a 
sub  optimal  solution.  The  study  of  the  condition  number  for  different 

types  of  quadrature  formulas  and  different  kernels  would  provide 
valuable  information. 

Another  area  that  deserves  some  exploration  consists  of 
modeling  the  lack  of  knowledge  of  the  blur  matrix  by  a probabilistic 
description,  leading  to  the  use  of  stochastic  regressors.  This  will  be 
particularly  valuable  in  the  lestoration  of  images  taken  through  tur- 
bulent atmosphere,  when  the  duration  of  the  exposure  is  not  long 
enough  so  that  the  blur  function  stabilizes.  This  work  would  comple- 
ment the  study  by  Slepian  [1-9],  by  considering  for  example,  statis- 
tical dependence  between  noise  and  the  point  spread  function. 

The  use  of  recursive  computational  schemes  of  the  regression 
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estimators  would  be  particularly  useful  in  real  time  applications, 
when  the  stream  of  data  comes  from  a row  or  column  scanning  of  the 
image.  Another  possibility  would  be  the  exploration  of  iterative  meth- 
ods to  solve  the  least  squares  problem,  in  the  unconstrained  or 
equality  constrained  constrained  cases.  This  method  is  equivalent 
to  solving  the  linear  system  of  normal  equations  in  an  iterative  man- 
ner. Finally,  the  area  of  nonlinear  regression  c ould  provide  an  ade- 
quate framework  to  deal  with  a situation  in  which  the  blur  cannot  be 
modeled  by  a linear  operation. 


APPENDIX  A 


HYPOTHESIS  TESTING  IN  THE 
OVERDETERMINED  MODEL 


One  of  the  advantages  of  the  model  developed  for  the  image 
restoration  problem  is  its  feature  of  being  mathematically  very 
tractable.  In  particular,  it  offers  the  framework  of  extensive  hy- 
pothesis testing  that  seems  well  suited  to  solve  several  image  de- 
tection problems.  In  chapter  4 the  linear  equality  constrained 
restoration  combines  estimation  and  hypothesis  testing  in  a common 
framework  of  image  restoration. 

The  assumption  of  an  overdetermined  model  (M  > N)  will  be 
made,  together  with  the  hypothesis  of  white  Gaussian  noise  corrupt- 
ing the  image.  Nevertheless,  some  recent  work  in  the  statistical 
literature  seem?  to  indicate  that  some  of  these  tests  are  robust,  in 
the  sense  that  they  seem  to  perform  well  even  when  the  Gaussian 
assumption  is  violated  [3-6,  page  615]. 

Suppose  first  that  a linear  functional  of  the  pixel  values  is  to 
be  tested.  In  this  case  the  experimenter  could  be  interested,  for 
example,  in  testing  whether  a single  pixel  is  equal  to  a certain  value 
or  not  or  whether  the  sum  of  all  pixel  values,  representing  an  inte- 
gral over  the  entire  picture,  is  also  equal  to  a prespecified  value 
or  not  equal. 
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Assume  that  the  linear  functional  is  represented  by  the  inner 
T 

product  £ x and  that  the  hypothesized  value  (represented  by  the 
hypothesis  H^)  is  cq.  Under  the  hypothesis  Hq,  the  random  variable 


T 

C X - c 
— — o 


(A  - 1 ) 


is  zero  mean  and  Gaussian.  Its  variance  is  given  by 


2 T T - 1 
a (c  (H  H)  c) 


Thus,  under  H , 
o 


a 


rT  T - 1 

& (H  H)  £ 


(A-2) 


(A-3) 


is  a zero  mean,  unit  variance  Gaussian  random  variable.  In  order 

to  perform  the  test  under  a given  level  of  significance,  a table  of 

standardized  normal  distribution  should  be  used.  The  threshold 

a > 0 is  chosen  such  that 
a 

P{N(0,  1)  > aa}  = P { -N(0,  1)  < -a^  = (A -4) 


where  a is  the  level  of  significance  of  the  test  also  known  as  the 
probability  of  false  alarm,  using  detection  theory  terminology.  If 


the  raidom  variable  given  by  (A -3)  falls  between  -a  and  a , Ho  is 

a a 

accepted.  Otherwise  H is  rejected. 

o 

It  should  be  noted  that  Hq  is  a simple  hypothesis  and  the 
alternative  Hj  is  a composite  hypothesis.  It  is  well  known  that  there 
is  no  uniformly  most  powerful  test  involving  a Gaussian  random 
variable  in  this  situation.  The  test  specified  in  the  previous  para- 
graph can  be  shown  to  be  the  uniformly  most  powerful  unbiased  test 

for  the  simple  hypothesis  Hq  against  the  alternative  composite  hy- 
pothesis H j. 

Assume  now  that  the  variance  of  the  noise  is  not  known. 

The  procedure  in  this  case  would  be  equivalent  to  performing  an 
estimation  of  the  variance  prior  to  testing.  In  the  section  on  con- 
fidence intervals  it  was  shown  that  the  variance  of  the  parametric 
function  could  be  estimated  by 

2 T T -1 

s (c  (H  H)  c)  (A -5) 

where 


As  a result,  the  statistic 
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T 

C X - c 


/T  T -1 
c (H  H)  o 


(A-7) 


is  Student  distributed  with  (M-N)  degrees  of  freedom.  The  test 
T T 

Hq:  £ x = cq  against  H ^ : £ x t cq  can  then  be  performed  by  the  use 
of  a Student  distribution  table.  It  should  be  remarked  that  for  a 


sample  size  larger  than  40,  the  test  can  be  performed  by  using 
tables  of  normal  distribution.  The  procedure  is  entirely  analogous 
to  the  case  of  known  variance. 

Suppose  now  that  the  problem  consists  in  testing  the  unknown 

2 2 

variance  of  the  noise.  Let  a = a be  the  H hypothesis  and 

o o 

2 2 

a ^ ct^  be  the  alternative  hypothesis.  In  the  section  concerning 

(M-N)s^ 

confidence  intervals  it  was  pointed  out  that  the  statistics  * ^ 

a 

is  chi-square  distributed  with  (M-N)  degrees  of  freedom.  Under 

(M-N)s^ 

the  H hypothesis  this  quantity  becomes  i , and  the  test,  for 

° 'o 

a given  level  of  significance,  can  be  performed  using  tables  of  the 

chi-square  distribution.  More  specifically,  using  a 100  a percent 

significance  level,  the  thresholds  a and  b of  this  two-sided  test 

a a 

are  chosen  satisfying  the  relationships 


P[  X2(M-N)  < a J = P[X2(M-N)  > b ] = -y  (A-8) 

CL  CL  £ 
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The  hypothesis  is  accepted  if  the  value  of  ~^s  falls  between 

rr 

o 

a^andb^.  Otherwise  the  hypothesis  is  rejected. 

Consider  now  the  problem  of  testing  the  whole  vector  x of 
pixel  values.  In  order  to  simplify  the  discussion,  it  will  be  assumed 
initially  that  the  hypothesis  specifies  the  zero  vector,  denoted  by 
£.  The  purpose  of  the  test  is  to  verify  whether  the  whole  picture  is 
zero  against  the  alternative  that  it  is  nonzero. 

The  following  discussion  will  make  use  of  three  quantities, 
denoted  by  Qj,  Q^,  and  Q^,  that  will  be  defined  now.  will 
measure  the  variation  of  the  observed  data  around  the  hypothesized 
regression  line,  specified  by  the  hypothesis  H 

o 

Q ! = [l-HO]T[X-HO]  = n_Tn  (A-9) 

^2  measure  the  variation  of  the  observed  data  around  the  regres- 
sion line  obtained  by  estimating  the  regression  coefficients  irrespec- 
tive of  whether  or  not  H is  true. 

o 

Q2  = [2-Z?x]T[£-Hx]  (A-10) 

Using  the  notation  defined  previously,  it  follows  that 

n - - t Tt  t T _ 

U2  ~ ~ Xk  iX  = vLv  = n Ln 


(A-l  1) 
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will  denote  the  variation  of  the  estimated  regression  line  around 
the  hypothesized  regression  line,  that  means 


Q3  = [Hi -HoJT[Hx -H  o] 

= (x- o)T(HTH)(x-o)  (A- 12) 

T T 

= x(HH)x 


Now  since 


. T-1T  T _ 1 T1 

X = (H  AH)  H i = (HAH)  H (H  x +n)  = 

T -1  T 
x + (H  H)  H n_ 


(A- 13) 


for  x = 0 it  follows  that 


T -1  T 
£ - (H  H)  H n 


(A- 14) 


So  that  (A- 12)  can  be  put  into  the  form 


T T-1T  T-1T 
Q3  = n H(H  H)  (H  H)(H  H)  H n 

T T -1  T 

= n H(H  H)  H 1 n (A- 15) 

= nT{I_- L)n 


Now,  by  observing  expressions  (A-9),  (A- 11)  and  (A- 15)  to- 


gether with  the  observations  that 
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rank  I_  = M 
rank  L = M-N 
rank(I_-L)  = N 


(A  - 1 6) 
(A  - 1 7) 
(A  - 1 8) 


Q]  2 Q Q 

It  follows  that  — Is  x.  (Ml,  -i-  ls  x2(M-N)  and  Is  X.2  (N) 
0 o 0Z 

distributed.  Also,  it  is  easy  to  verify  that 


Q1  = Q2+Q3 


fA- 19) 


Furthermore,  by  Cochran's  Theorem  [A-l,  pages  212-214],  Q and 

2 

Q3  are  independent  random  variables. 

From  the  definitions  of  Q2  and  Q3  it  should  be  clear  that  if 
Q3  is  large  compared  to  Q ^ this  would  lead  to  the  belief  that  the  hy- 
pothesis Hq  is  not  true.  In  fact,  this  would  mean  that  the  hypothe- 
sized regression  line  is  far  from  the  estimated  regression  line  as 
compared  to  the  variation  of  the  data  around  the  estimated  regression 
line.  Under  this  perspective,  the  statistics 


F = ■ 


_3_  M-N 
L N 


(A-20) 


is  appropriate  for  the  testing  procedure  [3-5,  page  96].  This  ratio 
is  distributed  according  to  the  Fischer  distribution,  with  N and  M-N 
degrees  of  freedom.  The  null  hypothesis  Hq  is  rejected  when  this 
ratio  exceeds  the  significance  limit. 
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So  far  the  test  has  been  developed  assuming  that  the  null  hy- 
pothesis specifies  the  zero  vector.  When,  in  general,  a nonzero 
pixel  vector  x ^ is  specified,  is  given  by 


Q,  = [Hx-Hx  ] [Hx-Hx  ] 


(A -21) 


and  still,  after  some  manipulation,  the  previous  expression  assumes 
the  form  of  equation  (A -15).  Since  Q2  is  not  affected  by  the  form  of 
the  null  hypothesis,  it  remains  the  same.  Therefore,  expression 
(A -20)  can  still  be  applied. 

In  some  circumstances  the  experimenter  might  be  interested 
in  testing  only  part  of  the  total  number  of  pixels  in  a picture.  This 
would  happen  when  it  is  known  that  a certain  object  to  be  detected 
could  occur  only  at  a certain  location  on  the  overall  picture.  In  this 
case,  the  previous  test  can  be  modified  to  take  into  account  this 
special  feature.  By  performing,  if  necessary,  a reordering  of  the 
pixel  values,  the  vector  x can  be  subdivided  into  two  subvectors. 


(A -22) 


Suppose  that  the  subvector  of  interest  is  x2  and  that  the  Hq  hypothesis 
consists  in  testing  x2  = x^  against  x2  i x^.  Under  this  partition  the 
linear  model  is  given  by 


* = GilH  )(  x ) + n 


(A-23) 


where  x is  an  (N  x 1)  vector,  x an  (N_  x 1)  vector  such  that  N + N = 
11  L L 12 

N and  and  H2  are,  respectively  (MxNj)  and  (Mx^)  matrices. 
Under  this  partition,  the  estimator  will  have  the  form  [3-5,  page  100l 


A = 1 *J  = CGi1M2)T(H1H2)]‘1(H1H2^ 


H,1'",  h/hV 

S-2«l  H^H2/ 


H,  I 


^ T - 1 T T-1t  it>  ' 

(Hj  Hj)  Hj  x - (H/Hj)  SjH^E  1 —2  — i 

-1  T 

, E H2LlX  / 


(A -24) 


Lj  = I- 


(A-25) 
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-1  T 

E 


(A -2  7) 


Comparing  (A-27)  with  (A-13)  and  taking  into  consideration 
(A-26),  it  follows  that  in  this  case  the  quantity  Q3  of  interest  for 
testing  has  the  form 


where  x^ 
has  a X. 


Q,  = (£,  -*»> 


(A -28) 


Q. 


= —2 


is  the  null  hypothesis.  Since  E.has  the  rank  N2*  2 


a 


distribution  with  N2  degrees  of  freedom  and  the  ratio 


°3  M-N 

F = Q * 

2 2 


(A-29) 


will  be  distributed  according  to  the  Fisher  distribution  with  I»2  and 
(M-N)  degrees  of  freedom.  A large  value  of  this  ratio  leads  to  the 

rejection  of  the  null  hypothesis. 

Several  other  tests  could  be  devised  concerning  the  pixel 
values  of  an  image  or  their  linear  combinations.  For  example,  sup- 
pose that  the  problem  consists  in  detecting  whether  there  is  a dif- 
ference between  the  gray  level  of  two  parts  of  a picture.  This  is  the 
so  called  edge  detection  problem.  Two  versions  of  this  question  can 
be  formulated.  In  the  first  one,  the  null  hypothesis  specifies^  = *2 
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wnere  and  x2  are  partitions  of  x but  the  common  value  is  unknown. 
In  the  second  version  the  common  value  is  known  and  represented  by 
a vector,  say  x*.  In  the  field  of  econometrics  tests  of  this  type  have 
been  called  tests  for  structural  stability  [ 3-5,  page  103-116]. 

Before  closing  this  appendix  some  observations  should  be 
made.  First,  even  though  no  justification  was  given  for  the  statistics 
used  in  the  tests,  it  can  be  shown  [ 3-6,  page  141]  that  these  tests  can 
be  rigorously  derived  through  the  use  of  likelihood  ratios,  using  some 
suitable  criterion,  like  the  Neyman- Pear  son  for  example. 

Second,  the  examples  of  tests  of  pixel  values  or  their  linear 
combinations  can  be  put  into  the  framework  cf  a general  linear  hy- 
pothesis represented  by  the  expressions 

Ho  : = L ’ Hj  : Rx/r  (A-30) 

where  R and  r are  matrices  of  dimensions  Q x M and  Q x 1,  re- 
spectively, the  rank  of  JR  being  Q.  A test  procedure  [ 3-6,  page  143] 
can  be  developed  for  this  general  case,  leading  to  the  previous  cases 
for  specific  choices  of  the  matrices  R and  r.  Reference  [ 3-3,  pages 
100-104]  presents  broader  results,  encompassing  the  cases  where 

the  rank  of  R may  be  not  Q and  the  underdetermined  model,  where  the 
rank  of  Id  may  not  be  N. 

Further  tests  could  be  devised,  using  the  large  body  of 
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material  developed  so  far  in  regression  analysis.  As  examples, 
could  be  mentioned  a test  to  verify  whether  the  white  noise  corrupting 
the  image  has  constant  variance  or  the  test  to  verify  if  any  degree  of 
correlation  exists  between  the  noise  elements  corrupting  different 
pixel  values.  This  last  procedure  would  be  based  on  the  so  called 
Durbin-Watson  test  developed  in  econometrics  [3-6,  pages  199-201}. 
Also,  tests  to  verify  the  hypothesis  of  linearity  of  the  blurring  process 
could  be  used,  being  based  on  the  von  Neumann  ratio  [3-6,  pages  222- 
224]. 
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APPENDIX  B 


COMPUTATION  OF  THE  NUMBER  OF  OPERATIONS 
FOR  DIFFERENT  METHODS  OF  RESTORATION 


In  this  appendix,  the  number  of  operations  involved  in  the 
computation  of  the  estimators  in  the  restoration  problem  is  presented. 
The  unconstrained,  linear  equality  and  linear  inequality  constrained 
methods  are  considered.  It  is  assumed  that  the  first  two  methods  use 
nonrecursive  forms  of  the  estimators  and  the  overdetermined  model 
is  involved. 

The  unconstrained  solution  is  obtained  by  a simple  matrix- 

2 

vector  multiplication.  If  the  observed  vector  ^ is  (M  x 1)  and  the 

2 

vector  of  pixel  values  x is  (N  x 1),  the  solution  would  involve 
2 2 

M x N multiply  and  add  operations. 

Expression  (4.2-2)  gives  the  solution  for  the  linear  equality 
constrained  restoration.  Assuming  that  P linear  constraints  are  in- 
corporated, the  number  of  operations  involved  in  obtaining  the  solution 
can  be  computed  as  follows: 


2 2 

M x N mult,  and  add  for  x 


N x P mult,  and  add  for  Ax 


P add  for  t - Ax 
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o 

N‘‘  x P mult,  and  add  for  K(t_-  Ax)  = x 

where  K = (HTC_  1H)"1AT[A(HTc"  1H)“1AT] 

N add  for  x + x 

Total:  N2(2P  + M2)  mult,  and  add  + (P  + N2) 

add 

The  computation  of  the  number  of  operations  involved  in  the 
solution  of  the  quadratic  programming  problems  for  inequality  con- 
strained restoration  is  more  complex  and  only  an  estimated  value  can 
be  given.  Suppose  that  the  problem  consists  of  minimizing  the  quad- 
ratic form  given  by  (4.  3-4)  subject  to  the  constraints  expressed  by 
(4.  3-2)  and  (4.  3-3).  Wolfe's  procedure  T 4-61  consists  essentially  in 
solving  the  equations  that  express  the  Kuhn  Tucker  cond_tions. 

The  procedure  consists  of  two  steps:  in  the  first  one,  three 
nonnegative  slack  vectors,  m,  aid^2  are  and  the  step 

consists  in  finding  a set  of  vectors  x :>  0,  z^Oandw  and  the  slack 
vectors  that  solve  the  problem  of  minimizing  the  objective  function 


225 


_„*T  -1  * * T *T  -1 

2H  V Hx  - z + B w - 2H  1 v 1 


— — Hx-z^  + Bw  - 2H  V y_  +_t.  - _t  =0  (B-3) 

1 2 


T * 

z x =0 


(B-4) 


where  the  notation  defined  by  equations  (4.3-5)  to  (4.3-13)  is  used. 
Except  for  the  last  restriction,  this  is  a linear  programming  problem. 
The  procedure  is  very  much  similar  to  the  simplex  method,  with  some 
modification  to  take  into  consideration  the  nonlinear  restriction. 

The  minimum  of  (B-l)  is  attained  for  m = 0 and  the  second 
step  consists  of  minimizing  the  objective  function 


2N2 
i=i  1 


(B-5) 


subject  to  the  constraints 


* 

Bx  = v 


(B-6) 


_*T , -1  * * l 

2H  V H X - z + B w - 2H ^v"1^  + E^=  0 (B-7) 


T * 

z x =0 


(B-8) 


where  t_  contains  the  remaining  nonzero  components  of_t  and_t  and  E 

* 2 

is  the  corresponding  coefficient  matrix. 


The  number  of  operations  of  each  step  will  be  computed  as  if 


"t' imf  Miwr  - 
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they  were  standard  simplex  methods.  This  is  evidently  an  approxi- 
mation but  it  should  be  a reasonable  one  since  a few  extra  operations 
are  necessary  to  impose  conditions  (B-4)  or  (B-8). 

Under  this  assumption,  the  number  of  operations  for  each  step 
is  estimated  [4-5,  page  55]  as  1.5x  (number  of  linear  equality  re- 
strictions) pivot  operations.  Since  in  each  step  the  number  of  linear 

2 

equality  restrictions  is  given  by  (S  + 3N  ),  it  follows  that  the  total 

2 

number  of  pivot  operations  should  be  about  3x(S  + 3N  ). 

A pivot  operation  in  linear  programming  comprises  three 
series  of  suboperations: 

a)  Determination  of  the  column  of  the  pivot,  that  is,  the  non- 
basic  variable  that  will  enter  the  basis.  This  is  done  by  a search  of 
the  most  negative  reduced  cost  coefficient  among  the  nonbasic  vari- 
ables. 

b)  Division  oZ  all  the  elements  of  this  column  by  the  corres- 
ponding right  hand  side  elements  and  search  for  the  smallest  non- 
negative quocient.  This  operation  will  determine  the  pivot. 

c)  The  operation  of  pivoting,  that  is,  reducing  to  one  the  pivot 
element  and  to  zero  all  other  elements  of  the  pivot  column  by  elemen- 
tary row  operations  and  updating  the  entire  simplex  tableax. 
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